Break and Continue Statements
For control structures we easily can support break and continue statements. In the grammar we extend the production rule for a statements as follows:
\[\begin{array}{rcl}\text{statement} & = & \text{io-hack-statement}\; \\ & | & \text{expr-statement}\; \\ & | & \text{compound-statement}\; \\ & | & \text{while-statement}\; \\ & | & \text{do-while-statement}\; \\ & | & \text{for-statement}\; \\ & | & \text{if-statement}\; \\ & | & \text{break-statement}\; \\ & | & \text{continue-statement}\; \\\text{break-statement} & = & \texttt{break}\; \texttt{";"} \\\text{continue-statement} & = & \texttt{continue}\; \texttt{";"} \\\end{array}\]We therefore need two more keywords and two more parse functions. Before we think about semantics and code generation let's just fix the parser that accepts these two new statements but neither checks the semantics nor generates any code
Updating the Parser
Adding new keywords is simple. Just add these two lines to tokenkind.txt:
1 2 | BREAK break
CONTINUE continue
|
In parse.c before the definition of parseStatement() we need two more forward declarations for parsing break and continue statements. And parseStatement() also tries to parse one of these new statements:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 | // for parsing statements
/* ... as before ... */
static bool parseBreakStatement(void);
static bool parseContinueStatement(void);
/* ... as before ... */
static bool
parseStatement(void)
{
if (parseIOHackStatement()) {
return true;
/* ... as before ... */
} else if (parseBreakStatement()) {
return true;
} else if (parseContinueStatement()) {
return true;
} else {
return false;
}
}
|
The initial implementation of parseBreakStatement() and parseContinueStatement() just checks the grammar and consumes all tokens if a corresponding statement could be parsed:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 | static bool
parseBreakStatement(void)
{
if (token.kind != BREAK) {
return false;
}
getToken();
expected(SEMICOLON);
getToken();
return true;
}
static bool
parseContinueStatement(void)
{
if (token.kind != CONTINUE) {
return false;
}
getToken();
expected(SEMICOLON);
getToken();
return true;
}
|
Intermediate Test
You now can test your code for typos. If everything compiles you also can check if the compiler accepts some test programs.
This program is syntactically correct but has a semantic error:
The same is the case for this program:
This program is syntactically and semantically correct:
1 2 3 4 5 6 7 8 9 10 11 12 | i = 0;
while (1) {
i = i + 1;
if (i % 2 == 0) {
continue;
}
$< i;
if (i > 10) {
break;
}
}
|
In its current form the compiler should compile all of these programs (but the code is of course for the trash).
Semantic Check and Code Generation
Both statements are only allowed within a loop. If this is not the case using one of these statements is a semantic error. Code generation depends on the type of loop that contains such a statement.
Stack for Break and Continue Labels
Every kind of loop has its specific break and continue label. When a loop gets parsed and code generated these labels can be pushed onto a stack (before statements for the loop body will be parsed). Before the parse function returns the labels need to be popped from the stack. So we need functions that implement push and pop operations for a pair of labels, e.g.
1 2 | static void pushLabelPair(const char *breakLabel, const char *continueLabel);
static void popLabelPair(void);
|
Break and continue statements are semantically incorrect if the stack is empty. Otherwise they generate an unconditional jump to the corresponding label that is currently at the top of the stack. It therefore is convenient to have two functions like
1 2 | static const char *topBreakLabel(void);
static const char *topContinueLabel(void);
|
that either return a null pointer if the stack is empty and otherwise a pointer to the corresponding label.
These four functions can be implemented in parse.c as follows:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 | static struct LabelPairNode
{
struct LabelPairNode *next;
const char *breakLabel;
const char *continueLabel;
} *labelPairTop;
static void
pushLabelPair(const char *breakLabel, const *continueLabel)
{
struct LabelPairNode *n = malloc(sizeof(*n));
if (!n) {
fprintf(stderr, "pushLabel: out of memory\n");
finalizeExit(1);
}
n->next = labelPairTop;
n->breakLabel = breakLabel;
n->continueLabel = continueLabel;
labelPairTop = n;
}
static const char *
topBreakLabel(void)
{
return labelPairTop ? labelPairTop->breakLabel : 0;
}
static const char *
topContinueLabel(void)
{
return labelPairTop ? labelPairTop->continueLabel : 0;
}
static void
popLabelPair(void)
{
assert(labelPairTop);
struct LabelPairNode *n = labelPairTop;
labelPairTop = labelPairTop->next;
free(n);
}
|
Break and Continue in While Loops
The code generation for while loops was in the last session based on this flow chart:
Hence the continue label equals the label for the conditional jump instruction, and the break label equals the label marking the end of the generated code block. Therefore the parse function for a while statements has to push these labels before statements for the loop body are parsed. And it has to pop the labels before it returns:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 | static bool
parseWhileStatement(void)
{
if (token.kind != WHILE) {
return false;
}
getToken();
expected(LPAREN);
getToken();
const struct Expr *cond = parseAssignmentExpr();
if (!cond) {
expectedError("assignment expression");
}
const char *endLabel = genGetLabel();
const char *condLabel = genGetLabel();
pushLabelPair(endLabel, condLabel);
genLabelDef(condLabel);
GenReg dest = genGetReg();
condJmpExpr(cond, dest, 0, endLabel);
genUngetReg(dest);
expected(RPAREN);
getToken();
if (!parseCompoundStatement()) {
expectedError("compound statement");
}
genJmp(condLabel);
genLabelDef(endLabel);
popLabelPair();
return true;
}
|
Code Generation for Break and Continue Statements
The code generation and semantics checks for these statements is now simple. If the stack is empty a semantic error was detected. Otherwise an unconditional jump gets generated:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 | static bool
parseBreakStatement(void)
{
if (token.kind != BREAK) {
return false;
}
struct TokenPos pos = token.pos;
getToken();
expected(SEMICOLON);
getToken();
const char *label = topBreakLabel();
if (!label) {
errorAtPos(pos, "break is not inside a loop");
}
genJmp(label);
return true;
}
static bool
parseContinueStatement(void)
{
if (token.kind != CONTINUE) {
return false;
}
struct TokenPos pos = token.pos;
getToken();
expected(SEMICOLON);
getToken();
const char *label = topContinueLabel();
if (!label) {
errorAtPos(pos, "continue is not inside a loop");
}
genJmp(label);
return true;
}
|
Tests
The compiler should now detect the semantic errors in
theon$ make error_continue ../xtest_abc error_continue.s < error_continue.abc 1.1: continue is not inside a loop make: *** [Makefile:17: error_continue.s] Error 1 rm error_continue.s theon$ make error_break ../xtest_abc error_break.s < error_break.abc 1.1: break is not inside a loop make: *** [Makefile:17: error_break.s] Error 1 rm error_break.s theon$
But if one of these statements occurs within a while loop now everything should work:
1 2 3 4 5 6 7 8 9 10 11 12 | i = 0;
while (1) {
i = i + 1;
if (i % 2 == 0) {
continue;
}
$< i;
if (i > 10) {
break;
}
}
|
theon$ make break_continue ../xtest_abc break_continue.s < break_continue.abc ../../../ulm-generator/1_ulm_build/stack/ulmas break_continue.s getuint64.s printuint64.s (echo "#! ../../../ulm-generator/1_ulm_build/stack/ulm"; cat a.out) > break_continue rm -f a.out chmod +x break_continue rm break_continue.s theon$ ./break_continue 1 3 5 7 9 11 theon$
Exercise
Also allow break and continue statements within for loops and while loops.