BAHA: Supporting While Loop Statements

After this long preparation it is time to see some notable results. The compiler should support loops. The first loop type will be a while loop. For seeing results fast the bad ass hacker approach (BAHA) is chosen. The goal is that the program

1
2
3
4
5
6
7
8
$> n;
res = 1;
i = 2;
while (i <= n) {
    res = res * i;
    i = i + 1;
}
$< res;

can be used to compute the factorial of a non-negative number. Other things that should also work will cause assertion failures. These things will be fixed in the aftermath.

Update of the Grammar

It will be required that the loop body has to be a compound statement. In C the loop body can be an expression statement like in

1
2
while (i <= n) 
    i = i + 1;

Our compiler will raise a syntax error. So my personal coding style is part of the grammar. Here the production rule for statements was changed and two new production rules added:

\[\begin{array}{rcl}\text{statement} & = & \text{io-hack-statement}\; \\ & | & \text{expr-statement}\; \\ & | & \text{compound-statement}\; \\ & | & \text{while-statement}\; \\\text{compound-statement} & = & \texttt{"\{"}\; \{\; \text{statement} \}\; \texttt{"\}"}\; \\\text{while-statement} & = & \texttt{while}\; \texttt{"("}\; \text{assignment-expression} \texttt{")"}\; \text{compound-statement} \\\end{array}\]

Parsing

Updating the parser is mainly a matter of organizing the code and some effort. There will be two new parse functions (For parsing compound statements and while statements). So we can extend the list of forward declarations

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
// for parsing statements
static bool parseStatement(void);
static bool parseIOHackStatement(void);
static bool parseExprStatement(void);
static bool parseCompoundStatement(void);
static bool parseWhileStatement(void);

// for parsing expressions
static const struct Expr *parseAssignmentExpr(void);
static const struct Expr *parseLeftAssocBinaryExpr(int prec);
static const struct Expr *parseUnaryExpr(void);
static const struct Expr *parsePrimaryExpr(void);

and when parsing a statement two more cases need to be treated:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
static bool
parseStatement(void)
{
    if (parseIOHackStatement()) {
        return true;
    } else if (parseExprStatement()) {
        return true;
    } else if (parseCompoundStatement()) {
        return true;
    } else if (parseWhileStatement()) {
        return true;
    } else {
        return false;
    }
}

Parsing Compound Statements

Parsing compound statements is straight forward:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
static bool
parseCompoundStatement(void)
{
    // "{"
    if (token.kind != LBRACE) {
        return false;
    }

    // { statement }
    getToken();
    while (parseStatement()) {
    }

    // "}"
    expected(RBRACE);
    getToken();
    return true;
}

Note that by parsing a statement also code will be generated for each statement. So the implementation of parseCompoundStatement() is already complete.

Parsing While Statements

Parsing a while statement is also straight forward

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
static bool
parseWhileStatement(void)
{
    // while
    if (token.kind != WHILE) {
        return false;
    }
    getToken();

    // "("
    expected(LPAREN);
    getToken();

    // assignment-expr
    const struct Expr *cond = parseAssignmentExpr();
    if (!cond) {
        expectedError("assignment expression");
    }

    // ")"
    expected(RPAREN);
    getToken();

    // compound-statement
    if (! parseCompoundStatement()) {
        expectedError("compound statement");
    }
    return true;
}

However, for generating code some additional considerations are required. And most of all, some modifications in expr.h and expr.c.

So for a moment we just implement the parsing function and deal with the code generation later. This will allow us to test the parser before we continue.

Testing the Parser

The program

1
2
3
4
5
6
7
8
$> n;
res = 1;
i = 2;
while (i <= n) {
    res = res * i;
    i = i + 1;
}
$< res;

should now compile without syntax error or assertion failure:

theon$ ./xtest_abc while_factorial.s < while_factorial.abc
theon$ 

In a next step correct assembly code needs to be generated for while loops.

Code Generation

Code for control structures requires conditional jumps. Function condJmpExpr() in expr.c is providing exactly what is needed. It can be used to generate a conditional jump if a condition is either true (when trueLabel is not a null pointer) or false (when falseLabel is not a null pointer). Currently it is declared as static:

1
2
3
4
5
6
static void
condJmpExpr(const struct Expr *expr, GenReg dest, const char *trueLabel,
            const char *falseLabel)
{
    // ...
}

Now we remove the static keyword in expr.c and also declare it in expr.h.

The Bad Ass Hackers Approach

Note that calling condJmpExpr() results in an assertion failure if it gets an expression tree where the root note is neither an equality nor relational expression node. In our test program the root has the expression kind EK_EQUAL. So for now everything is fine and we deal with the rest later.

Choosing a Variant for Code Generation

There are two variants for realizing a while loop in assembly (See Session 10, Page 2). One of them jumps forward to the end of the loop if the condition is false. For assembly programming the condition had to be negated because conditional jump instructions only trigger a jump if a condition is true. Here however, the code generation interface deals with negating the condition when a “false label” is used instead of a “true label”. So the flow chart can be expressed as follows:

The flow chart now describes for in parseWhileStatement() code should be generated:

  • We need a label (e.g. cond) which gets defined before code for the condition expression and the loop body statements gets generated.

  • We need a label (e.g. end) which gets defined after the code generated for the loop body.

  • From the condition expression a conditional jump needs to be generated that jumps to the end of the loop body if the condition is false.

This can be implemented as follows:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
static bool
parseWhileStatement(void)
{
    if (token.kind != WHILE) {
        return false;
    }
    getToken();
    expected(LPAREN);
    getToken();
    const struct Expr *cond = parseAssignmentExpr();
    if (!cond) {
        expectedError("assignment expression");
    }
    const char *endLabel = genGetLabel();
    const char *condLabel = genGetLabel();

    // label for the condition
    genLabelDef(condLabel);

    // generate conditional jump from the condition expression
    GenReg dest = genGetReg();
    condJmpExpr(cond, dest, 0, endLabel);
    genUngetReg(dest);

    expected(RPAREN);
    getToken();
    if (! parseCompoundStatement()) {
        expectedError("compound statement");
    }

    // jump back to check the condition
    genJmp(condLabel);

    // mark the end of the loop
    genLabelDef(endLabel);
    return true;
}

Test: Computing the Factorial

So back to our initial example:

1
2
3
4
5
6
7
8
$> n;
res = 1;
i = 2;
while (i <= n) {
    res = res * i;
    i = i + 1;
}
$< res;

This should now work fine:

theon$ make while_factorial
../xtest_abc while_factorial.s < while_factorial.abc
../../../ulm-generator/1_ulm_build/stack/ulmas while_factorial.s getuint64.s printuint64.s
(echo "#! ../../../ulm-generator/1_ulm_build/stack/ulm"; cat a.out) > while_factorial
rm -f a.out
chmod +x while_factorial
rm while_factorial.s
theon$ echo 5 | ./while_factorial
120
theon$ 

You don't like multiplications? And you prefer complicated over simple? Then use this little beast for computing the factorial:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
$> n;
res = 1;
i = 2;
while (i <= n) {
    tmp = res;
    j = i;
    while (j > 1) {
        res = res + tmp;
        j = j - 1;
    }
    i = i + 1;
}
$< res;

Nested while loops are no problemo:

theon$ make while_factorial2
../xtest_abc while_factorial2.s < while_factorial2.abc
../../../ulm-generator/1_ulm_build/stack/ulmas while_factorial2.s getuint64.s printuint64.s
(echo "#! ../../../ulm-generator/1_ulm_build/stack/ulm"; cat a.out) > while_factorial2
rm -f a.out
chmod +x while_factorial2
theon$ echo 5 | ./while_factorial2
120
theon$ 

Things to Fix

Now consider this program:

1
2
3
4
5
6
7
$> n;
res = 1;
while (n) {
    res = res * n;
    n = n - 1;
}
$< res;

Of course this should also work:

theon$ make while_factorial3
../xtest_abc while_factorial3.s < while_factorial3.abc
../../../ulm-generator/1_ulm_build/stack/ulmas while_factorial3.s getuint64.s printuint64.s
(echo "#! ../../../ulm-generator/1_ulm_build/stack/ulm"; cat a.out) > while_factorial3
rm -f a.out
chmod +x while_factorial3
rm while_factorial3.s
theon$ echo 5 | ./while_factorial3
120
theon$ 

However, by my calculations you should get an assertion failure with your current code. The problem should be that currently condJmpExpr() only handles a few expression kinds:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
void
condJmpExpr(const struct Expr *expr, GenReg dest, const char *trueLabel,
            const char *falseLabel)
{
    // ...

    switch (expr->kind) {
        case EK_LESS:
        case EK_LESS_EQUAL:
        case EK_GREATER:
        case EK_GREATER_EQUAL:
        case EK_EQUAL:
        case EK_NOT_EQUAL:
            {
                // ...
            }
            return;
        default:
            assert(0);
    }
}

The assertion failure gets triggered in the default case. However, in the default case you simply can load the expression into the destination register and then generate a conditional jump depending on whether the value of the expression is non-zero (true) or zero (false):

1
2
3
loadExpr(expr, dest);
genOp2i(GEN_CMP_I, 0, dest);
genCondJmp(GEN_NOT_EQUAL, trueLabel, falseLabel);