Parsing Left Associative Binary Operators

We will further extend our grammar (See Session 24, Page 3). Before that we will simplify its implementation in parse.c. You certainly have noticed for the production rules

\[\begin{array}{rcl}\text{equality-expr} & = & \text{relational-expr}\; \{\; (\; \texttt{"=="}\; |\; \texttt{"!="}\; )\; \text{relational-expr} \} \\\text{relational-expr} & = & \text{additive-expr}\; \{\; (\; \texttt{">"}\; |\; \texttt{"<"}\; |\; \texttt{">="}\; |\; \texttt{"<="}\; )\; \text{additive-expr} \} \\\text{additive-expr} & = & \text{multiplicative-expr}\; \{\; (\; \texttt{"+"}\; |\; \texttt{"-"}\; )\; \text{multiplicative-expr} \} \\\text{multiplicative-expr} & = & \text{unary-expr}\; \{\; (\; \texttt{"*"}\; |\; \texttt{"/"}\; |\; \texttt{"%"} )\; \text{unary-expr} \} \\\end{array}\]

the implementation of the corresponding parse functions have the same pattern. These rules have in common that they are production rules for left associative binary operators. We will exploit this an will replace parseEqualExpr(), parseRelExpr(), parseAdditiveExpr() and parseMultiplicativeExpr() with a single function parseLeftAssocBinaryExpr().

Precedence Levels for Left Associative Operators

The grammar specifies the different precedences of the operators. Operators that occur first in a production rule have the lowest precedence. Below the precedence levels are noted in table form analogously to the corresponding precedence levels in C (See Session 8, Page 4).

Precedence

Operators

Meaning

13

* (multiply)

/ (divide)

% (modulo)

Multiplicative

12

+ (add)

- (subtract)

Additive

10

< (less)

> (greater)

<= (less equal)

>= (greater equal)

Relation

9

==

!=

Equality

Inequality

For the implementation of a common parse function it is convenient to implement two auxiliary functions. Function tokenKindPrec() will be used to obtain the precedence form a given token kind, and function makeBinaryExprKind() will be used to translate the token kind into a corresponding expression kind. Both functions later can be generated from a textual description.

Obtain Precedence from Token Kinds

If a token kind represents one of the operators in the above table function tokenKindPrec() returns its precedence which will always be a positive value. Otherwise it returns zero:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
static int
tokenKindPrec(enum TokenKind kind)
{
    switch (kind) {
        case ASTERISK:
        case SLASH:
        case PERCENT:
            return 13;
        case PLUS:
        case MINUS:
            return 12;
        /*
            TODO:
            handle all other cases of left associative binary operators
        */
        default:
            return 0;
    }
}

Complete the implementation by handling in the switch statement all other token kinds that represent a left associative binary operator.

Translating Token Kinds into Expression Kinds

For creating binary expressions the token kind needs to be translated into an expression kind. This is implemented in makeBinaryExprKind() and it is an internal error if it received a parameter that can not be translated:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
static enum ExprKind
makeBinaryExprKind(enum TokenKind kind)
{
    switch (kind) {
        case ASTERISK:
            return EK_MUL;
        case SLASH:
            return EK_DIV;
        case PERCENT:
            return EK_MOD;
        case PLUS:
            return EK_ADD;
        case MINUS:
            return EK_SUB;
        /*
            TODO:
            handle all other cases of left associative binary operators
        */
        default:
            fprintf(stderr, "makeBinaryExprKind: token kind %s (%d) not "
                    "handled\n", strTokenKind(kind), kind);
            assert(0);
            return 0;
    }
}

Complete the implementation by handling all relevant token kinds, i.e. all token kinds where tokenKindPrec() returns a positive value.

Parse Function for Left Associative Binary Operators

Now consider the following implementation of parseLeftAssocBinaryExpr():

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
static const struct Expr *
parseLeftAssocBinaryExpr(int prec)
{
    if (prec > 13) {
        return parseUnaryExpr();
    } else {
        const struct Expr *expr = parseBinaryExpr(prec + 1);
        while (tokenKindPrec(token.kind) == prec) {
            enum ExprKind op = makeBinaryExprKind(token.kind);
            getToken();
            expr = newBinaryExpr(op, expr, parseLeftAssocBinaryExpr(prec + 1));
        }
        return expr;
    }
}

Convince yourself that for example parseLeftAssocBinaryExpr(9) is equivalent to parseEqualExpr()

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
static const struct Expr *
parseEqualExpr(void)
{
    const struct Expr *expr = parseRelExpr();
    while (token.kind == EQUAL2 || token.kind == NOT_EQUAL) {
        enum TokenKind op = token.kind;
        getToken();
        const struct Expr *exprRight = parseRelExpr();
        if (op == EQUAL2) {
            expr = newBinaryExpr(EK_EQUAL, expr, exprRight);
        } else if (op == NOT_EQUAL) {
            expr = newBinaryExpr(EK_NOT_EQUAL, expr, exprRight);
        }
    }
    return expr;
}

and calling parseLeftAssocBinaryExpr(14) is equivalent to calling parseUnaryExpr(). It is also no problem that we have “gaps” in the precedence levels, e.g. we have no operator with a precedence level of 10.

Examples

For illustrating how parseLeftAssocBinaryExpr() works the implementation was modified so that we can see its call stack:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
static const struct Expr *
parseLeftAssocBinaryExpr(int prec)
{
    printf("%*sparseBinaryExpr(%d)\n", prec, "", prec);
    if (prec > 13) {
        printf("%*s->parseUnaryExpr()\n", prec, "");
        return parseUnaryExpr();
    } else {
        const struct Expr *expr = parseLeftAssocBinaryExpr(prec + 1);
        while (tokenKindPrec(token.kind) == prec) {
            enum TokenKind opTokenKind = token.kind;
            enum ExprKind op = makeBinaryExprKind(token.kind);
            getToken();

            printf("%*s->newBinaryExpr(...) for operator %s\n", prec, "",
                   strTokenKind(opTokenKind));
            expr = newBinaryExpr(op, expr, parseLeftAssocBinaryExpr(prec + 1));
        }
        return expr;
    }
}

a + b * (c + d)

Here the call stack when the expression a + b * (c + d) gets parsed:

theon$ echo "a + b * (c + d);" | ./xtest_abc foo.s foo.tex
 parseBinaryExpr(1)
  parseBinaryExpr(2)
   parseBinaryExpr(3)
    parseBinaryExpr(4)
     parseBinaryExpr(5)
      parseBinaryExpr(6)
       parseBinaryExpr(7)
        parseBinaryExpr(8)
         parseBinaryExpr(9)
          parseBinaryExpr(10)
           parseBinaryExpr(11)
            parseBinaryExpr(12)
             parseBinaryExpr(13)
              parseBinaryExpr(14)
              ->parseUnaryExpr()
            ->newBinaryExpr(...) for operator PLUS
             parseBinaryExpr(13)
              parseBinaryExpr(14)
              ->parseUnaryExpr()
             ->newBinaryExpr(...) for operator ASTERISK
              parseBinaryExpr(14)
              ->parseUnaryExpr()
 parseBinaryExpr(1)
  parseBinaryExpr(2)
   parseBinaryExpr(3)
    parseBinaryExpr(4)
     parseBinaryExpr(5)
      parseBinaryExpr(6)
       parseBinaryExpr(7)
        parseBinaryExpr(8)
         parseBinaryExpr(9)
          parseBinaryExpr(10)
           parseBinaryExpr(11)
            parseBinaryExpr(12)
             parseBinaryExpr(13)
              parseBinaryExpr(14)
              ->parseUnaryExpr()
            ->newBinaryExpr(...) for operator PLUS
             parseBinaryExpr(13)
              parseBinaryExpr(14)
              ->parseUnaryExpr()
theon$ 

z == x * ( y + z) / a + 5 + (x == y)

Here the call stack when the expression z == x * ( y + z) / a + 5 + (x == y) gets parsed:

theon$ echo "z == x * ( y + z) / a + 5 + (x == y);" | ./xtest_abc foo.s foo.tex
 parseBinaryExpr(1)
  parseBinaryExpr(2)
   parseBinaryExpr(3)
    parseBinaryExpr(4)
     parseBinaryExpr(5)
      parseBinaryExpr(6)
       parseBinaryExpr(7)
        parseBinaryExpr(8)
         parseBinaryExpr(9)
          parseBinaryExpr(10)
           parseBinaryExpr(11)
            parseBinaryExpr(12)
             parseBinaryExpr(13)
              parseBinaryExpr(14)
              ->parseUnaryExpr()
          ->newBinaryExpr(...) for operator EQUAL2
           parseBinaryExpr(11)
            parseBinaryExpr(12)
             parseBinaryExpr(13)
              parseBinaryExpr(14)
              ->parseUnaryExpr()
             ->newBinaryExpr(...) for operator ASTERISK
              parseBinaryExpr(14)
              ->parseUnaryExpr()
 parseBinaryExpr(1)
  parseBinaryExpr(2)
   parseBinaryExpr(3)
    parseBinaryExpr(4)
     parseBinaryExpr(5)
      parseBinaryExpr(6)
       parseBinaryExpr(7)
        parseBinaryExpr(8)
         parseBinaryExpr(9)
          parseBinaryExpr(10)
           parseBinaryExpr(11)
            parseBinaryExpr(12)
             parseBinaryExpr(13)
              parseBinaryExpr(14)
              ->parseUnaryExpr()
            ->newBinaryExpr(...) for operator PLUS
             parseBinaryExpr(13)
              parseBinaryExpr(14)
              ->parseUnaryExpr()
             ->newBinaryExpr(...) for operator SLASH
              parseBinaryExpr(14)
              ->parseUnaryExpr()
            ->newBinaryExpr(...) for operator PLUS
             parseBinaryExpr(13)
              parseBinaryExpr(14)
              ->parseUnaryExpr()
            ->newBinaryExpr(...) for operator PLUS
             parseBinaryExpr(13)
              parseBinaryExpr(14)
              ->parseUnaryExpr()
 parseBinaryExpr(1)
  parseBinaryExpr(2)
   parseBinaryExpr(3)
    parseBinaryExpr(4)
     parseBinaryExpr(5)
      parseBinaryExpr(6)
       parseBinaryExpr(7)
        parseBinaryExpr(8)
         parseBinaryExpr(9)
          parseBinaryExpr(10)
           parseBinaryExpr(11)
            parseBinaryExpr(12)
             parseBinaryExpr(13)
              parseBinaryExpr(14)
              ->parseUnaryExpr()
          ->newBinaryExpr(...) for operator EQUAL2
           parseBinaryExpr(11)
            parseBinaryExpr(12)
             parseBinaryExpr(13)
              parseBinaryExpr(14)
              ->parseUnaryExpr()
theon$ 

Optimization of parseLeftAssocBinaryExpr

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
static const struct Expr *
parseLeftAssocBinaryExpr(int prec)
{
    const struct Expr *expr = parseUnaryExpr();
    for (int p = tokenKindPrec(token.kind); p >= prec; --p) {
        while (tokenKindPrec(token.kind) == p) {
            enum ExprKind op = makeBinaryExprKind(token.kind);
            getToken();
            expr = newBinaryExpr(op, expr, parseLeftAssocBinaryExpr(p + 1));
        }
    }
    return expr;
}

For visualizing the call stack it was modified to:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
static const struct Expr *
parseLeftAssocBinaryExpr(int prec)
{
    printf("%*sparseBinaryExpr(%d)\n", prec, "", prec);
    const struct Expr *expr = parseUnaryExpr();
    printf("%*s->parseUnaryExpr()\n", prec, "");
    for (int p = tokenKindPrec(token.kind); p >= prec; --p) {
        while (tokenKindPrec(token.kind) == p) {
            enum TokenKind opTokenKind = token.kind;
            enum ExprKind op = makeBinaryExprKind(token.kind);
            getToken();
            printf("%*s->newBinaryExpr(...) for operator %s\n", prec, "",
                   strTokenKind(opTokenKind));
            expr = newBinaryExpr(op, expr, parseLeftAssocBinaryExpr(p + 1));
        }
    }
    return expr;
}

a + b * (c + d)

Here the call stack when the expression a + b * (c + d) gets parsed:

theon$ echo "a + b * (c + d);" | ./xtest_abc foo.s foo.tex
 parseBinaryExpr(1)
 ->parseUnaryExpr()
 ->newBinaryExpr(...) for operator PLUS
             parseBinaryExpr(13)
             ->parseUnaryExpr()
             ->newBinaryExpr(...) for operator ASTERISK
              parseBinaryExpr(14)
 parseBinaryExpr(1)
 ->parseUnaryExpr()
 ->newBinaryExpr(...) for operator PLUS
             parseBinaryExpr(13)
             ->parseUnaryExpr()
              ->parseUnaryExpr()
theon$ 

z == x * ( y + z) / a + 5 + (x == y)

Here the call stack when the expression z == x * ( y + z) / a + 5 + (x == y) gets parsed:

theon$ echo "z == x * ( y + z) / a + 5 + (x == y);" | ./xtest_abc foo.s foo.tex
 parseBinaryExpr(1)
 ->parseUnaryExpr()
 ->newBinaryExpr(...) for operator EQUAL2
           parseBinaryExpr(11)
           ->parseUnaryExpr()
           ->newBinaryExpr(...) for operator ASTERISK
              parseBinaryExpr(14)
 parseBinaryExpr(1)
 ->parseUnaryExpr()
 ->newBinaryExpr(...) for operator PLUS
             parseBinaryExpr(13)
             ->parseUnaryExpr()
              ->parseUnaryExpr()
           ->newBinaryExpr(...) for operator SLASH
              parseBinaryExpr(14)
              ->parseUnaryExpr()
           ->newBinaryExpr(...) for operator PLUS
             parseBinaryExpr(13)
             ->parseUnaryExpr()
           ->newBinaryExpr(...) for operator PLUS
             parseBinaryExpr(13)
 parseBinaryExpr(1)
 ->parseUnaryExpr()
 ->newBinaryExpr(...) for operator EQUAL2
           parseBinaryExpr(11)
           ->parseUnaryExpr()
             ->parseUnaryExpr()
theon$ 

Exercise

Implement parseLeftAssocBinaryExpr() and the auxiliary functions as described above. Do not forget to remove the forward declarations of the obsolete parse functions. In parseAssignmentExpr() call parseLeftAssocBinaryExpr(1) instead of parseLeftAssocBinaryExpr(9). Then nothing needs to be changed in this place when new left associative binary operators will be added.

Write a test program for the ABC compiler. For example (but feel free to make it more extensive):

1
2
3
4
5
6
7
a + b * (c + d);
x + 1 == y;
x + 1 != y * z;
x + 1 <= y * z;
x + 1 >= y * z;
x + 1 < y * z;
x + 1 > y * z;

and check the expression trees in the generated Latex. In this case

theon$ ./xtest_abc test.s test.tex < test.abc
 parseBinaryExpr(1)
 ->parseUnaryExpr()
 ->newBinaryExpr(...) for operator PLUS
             parseBinaryExpr(13)
             ->parseUnaryExpr()
             ->newBinaryExpr(...) for operator ASTERISK
              parseBinaryExpr(14)
 parseBinaryExpr(1)
 ->parseUnaryExpr()
 ->newBinaryExpr(...) for operator PLUS
             parseBinaryExpr(13)
             ->parseUnaryExpr()
              ->parseUnaryExpr()
 parseBinaryExpr(1)
 ->parseUnaryExpr()
 ->newBinaryExpr(...) for operator PLUS
             parseBinaryExpr(13)
             ->parseUnaryExpr()
 ->newBinaryExpr(...) for operator EQUAL2
           parseBinaryExpr(11)
           ->parseUnaryExpr()
 parseBinaryExpr(1)
 ->parseUnaryExpr()
 ->newBinaryExpr(...) for operator PLUS
             parseBinaryExpr(13)
             ->parseUnaryExpr()
 ->newBinaryExpr(...) for operator NOT_EQUAL
           parseBinaryExpr(11)
           ->parseUnaryExpr()
           ->newBinaryExpr(...) for operator ASTERISK
              parseBinaryExpr(14)
              ->parseUnaryExpr()
 parseBinaryExpr(1)
 ->parseUnaryExpr()
 ->newBinaryExpr(...) for operator PLUS
             parseBinaryExpr(13)
             ->parseUnaryExpr()
 ->newBinaryExpr(...) for operator LESS_EQUAL
            parseBinaryExpr(12)
            ->parseUnaryExpr()
            ->newBinaryExpr(...) for operator ASTERISK
              parseBinaryExpr(14)
              ->parseUnaryExpr()
 parseBinaryExpr(1)
 ->parseUnaryExpr()
 ->newBinaryExpr(...) for operator PLUS
             parseBinaryExpr(13)
             ->parseUnaryExpr()
 ->newBinaryExpr(...) for operator GREATER_EQUAL
            parseBinaryExpr(12)
            ->parseUnaryExpr()
            ->newBinaryExpr(...) for operator ASTERISK
              parseBinaryExpr(14)
              ->parseUnaryExpr()
 parseBinaryExpr(1)
 ->parseUnaryExpr()
 ->newBinaryExpr(...) for operator PLUS
             parseBinaryExpr(13)
             ->parseUnaryExpr()
 ->newBinaryExpr(...) for operator LESS
            parseBinaryExpr(12)
            ->parseUnaryExpr()
            ->newBinaryExpr(...) for operator ASTERISK
              parseBinaryExpr(14)
              ->parseUnaryExpr()
 parseBinaryExpr(1)
 ->parseUnaryExpr()
 ->newBinaryExpr(...) for operator PLUS
             parseBinaryExpr(13)
             ->parseUnaryExpr()
 ->newBinaryExpr(...) for operator GREATER
            parseBinaryExpr(12)
            ->parseUnaryExpr()
            ->newBinaryExpr(...) for operator ASTERISK
              parseBinaryExpr(14)
              ->parseUnaryExpr()
theon$ lualatex test.tex > /dev/null
theon$ 

you should get this pdf.