Parsing Left Associative Binary Operators
We will further extend our grammar (See Session 24, Page 3). Before that we will simplify its implementation in parse.c. You certainly have noticed for the production rules
\[\begin{array}{rcl}\text{equality-expr} & = & \text{relational-expr}\; \{\; (\; \texttt{"=="}\; |\; \texttt{"!="}\; )\; \text{relational-expr} \} \\\text{relational-expr} & = & \text{additive-expr}\; \{\; (\; \texttt{">"}\; |\; \texttt{"<"}\; |\; \texttt{">="}\; |\; \texttt{"<="}\; )\; \text{additive-expr} \} \\\text{additive-expr} & = & \text{multiplicative-expr}\; \{\; (\; \texttt{"+"}\; |\; \texttt{"-"}\; )\; \text{multiplicative-expr} \} \\\text{multiplicative-expr} & = & \text{unary-expr}\; \{\; (\; \texttt{"*"}\; |\; \texttt{"/"}\; |\; \texttt{"%"} )\; \text{unary-expr} \} \\\end{array}\]the implementation of the corresponding parse functions have the same pattern. These rules have in common that they are production rules for left associative binary operators. We will exploit this an will replace parseEqualExpr(), parseRelExpr(), parseAdditiveExpr() and parseMultiplicativeExpr() with a single function parseLeftAssocBinaryExpr().
Precedence Levels for Left Associative Operators
The grammar specifies the different precedences of the operators. Operators that occur first in a production rule have the lowest precedence. Below the precedence levels are noted in table form analogously to the corresponding precedence levels in C (See Session 8, Page 4).
Precedence |
Operators |
Meaning |
13 |
* (multiply) / (divide) % (modulo) |
Multiplicative |
12 |
+ (add) - (subtract) |
Additive |
10 |
< (less) > (greater) <= (less equal) >= (greater equal) |
Relation |
9 |
== != |
Equality Inequality |
For the implementation of a common parse function it is convenient to implement two auxiliary functions. Function tokenKindPrec() will be used to obtain the precedence form a given token kind, and function makeBinaryExprKind() will be used to translate the token kind into a corresponding expression kind. Both functions later can be generated from a textual description.
Obtain Precedence from Token Kinds
If a token kind represents one of the operators in the above table function tokenKindPrec() returns its precedence which will always be a positive value. Otherwise it returns zero:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 | static int
tokenKindPrec(enum TokenKind kind)
{
switch (kind) {
case ASTERISK:
case SLASH:
case PERCENT:
return 13;
case PLUS:
case MINUS:
return 12;
/*
TODO:
handle all other cases of left associative binary operators
*/
default:
return 0;
}
}
|
Complete the implementation by handling in the switch statement all other token kinds that represent a left associative binary operator.
Translating Token Kinds into Expression Kinds
For creating binary expressions the token kind needs to be translated into an expression kind. This is implemented in makeBinaryExprKind() and it is an internal error if it received a parameter that can not be translated:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 | static enum ExprKind
makeBinaryExprKind(enum TokenKind kind)
{
switch (kind) {
case ASTERISK:
return EK_MUL;
case SLASH:
return EK_DIV;
case PERCENT:
return EK_MOD;
case PLUS:
return EK_ADD;
case MINUS:
return EK_SUB;
/*
TODO:
handle all other cases of left associative binary operators
*/
default:
fprintf(stderr, "makeBinaryExprKind: token kind %s (%d) not "
"handled\n", strTokenKind(kind), kind);
assert(0);
return 0;
}
}
|
Complete the implementation by handling all relevant token kinds, i.e. all token kinds where tokenKindPrec() returns a positive value.
Parse Function for Left Associative Binary Operators
Now consider the following implementation of parseLeftAssocBinaryExpr():
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 | static const struct Expr *
parseLeftAssocBinaryExpr(int prec)
{
if (prec > 13) {
return parseUnaryExpr();
} else {
const struct Expr *expr = parseBinaryExpr(prec + 1);
while (tokenKindPrec(token.kind) == prec) {
enum ExprKind op = makeBinaryExprKind(token.kind);
getToken();
expr = newBinaryExpr(op, expr, parseLeftAssocBinaryExpr(prec + 1));
}
return expr;
}
}
|
Convince yourself that for example parseLeftAssocBinaryExpr(9) is equivalent to parseEqualExpr()
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 | static const struct Expr *
parseEqualExpr(void)
{
const struct Expr *expr = parseRelExpr();
while (token.kind == EQUAL2 || token.kind == NOT_EQUAL) {
enum TokenKind op = token.kind;
getToken();
const struct Expr *exprRight = parseRelExpr();
if (op == EQUAL2) {
expr = newBinaryExpr(EK_EQUAL, expr, exprRight);
} else if (op == NOT_EQUAL) {
expr = newBinaryExpr(EK_NOT_EQUAL, expr, exprRight);
}
}
return expr;
}
|
and calling parseLeftAssocBinaryExpr(14) is equivalent to calling parseUnaryExpr(). It is also no problem that we have “gaps” in the precedence levels, e.g. we have no operator with a precedence level of 10.
Examples
For illustrating how parseLeftAssocBinaryExpr() works the implementation was modified so that we can see its call stack:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 | static const struct Expr *
parseLeftAssocBinaryExpr(int prec)
{
printf("%*sparseBinaryExpr(%d)\n", prec, "", prec);
if (prec > 13) {
printf("%*s->parseUnaryExpr()\n", prec, "");
return parseUnaryExpr();
} else {
const struct Expr *expr = parseLeftAssocBinaryExpr(prec + 1);
while (tokenKindPrec(token.kind) == prec) {
enum TokenKind opTokenKind = token.kind;
enum ExprKind op = makeBinaryExprKind(token.kind);
getToken();
printf("%*s->newBinaryExpr(...) for operator %s\n", prec, "",
strTokenKind(opTokenKind));
expr = newBinaryExpr(op, expr, parseLeftAssocBinaryExpr(prec + 1));
}
return expr;
}
}
|
a + b * (c + d)
Here the call stack when the expression a + b * (c + d) gets parsed:
theon$ echo "a + b * (c + d);" | ./xtest_abc foo.s foo.tex parseBinaryExpr(1) parseBinaryExpr(2) parseBinaryExpr(3) parseBinaryExpr(4) parseBinaryExpr(5) parseBinaryExpr(6) parseBinaryExpr(7) parseBinaryExpr(8) parseBinaryExpr(9) parseBinaryExpr(10) parseBinaryExpr(11) parseBinaryExpr(12) parseBinaryExpr(13) parseBinaryExpr(14) ->parseUnaryExpr() ->newBinaryExpr(...) for operator PLUS parseBinaryExpr(13) parseBinaryExpr(14) ->parseUnaryExpr() ->newBinaryExpr(...) for operator ASTERISK parseBinaryExpr(14) ->parseUnaryExpr() parseBinaryExpr(1) parseBinaryExpr(2) parseBinaryExpr(3) parseBinaryExpr(4) parseBinaryExpr(5) parseBinaryExpr(6) parseBinaryExpr(7) parseBinaryExpr(8) parseBinaryExpr(9) parseBinaryExpr(10) parseBinaryExpr(11) parseBinaryExpr(12) parseBinaryExpr(13) parseBinaryExpr(14) ->parseUnaryExpr() ->newBinaryExpr(...) for operator PLUS parseBinaryExpr(13) parseBinaryExpr(14) ->parseUnaryExpr() theon$
z == x * ( y + z) / a + 5 + (x == y)
Here the call stack when the expression z == x * ( y + z) / a + 5 + (x == y) gets parsed:
theon$ echo "z == x * ( y + z) / a + 5 + (x == y);" | ./xtest_abc foo.s foo.tex parseBinaryExpr(1) parseBinaryExpr(2) parseBinaryExpr(3) parseBinaryExpr(4) parseBinaryExpr(5) parseBinaryExpr(6) parseBinaryExpr(7) parseBinaryExpr(8) parseBinaryExpr(9) parseBinaryExpr(10) parseBinaryExpr(11) parseBinaryExpr(12) parseBinaryExpr(13) parseBinaryExpr(14) ->parseUnaryExpr() ->newBinaryExpr(...) for operator EQUAL2 parseBinaryExpr(11) parseBinaryExpr(12) parseBinaryExpr(13) parseBinaryExpr(14) ->parseUnaryExpr() ->newBinaryExpr(...) for operator ASTERISK parseBinaryExpr(14) ->parseUnaryExpr() parseBinaryExpr(1) parseBinaryExpr(2) parseBinaryExpr(3) parseBinaryExpr(4) parseBinaryExpr(5) parseBinaryExpr(6) parseBinaryExpr(7) parseBinaryExpr(8) parseBinaryExpr(9) parseBinaryExpr(10) parseBinaryExpr(11) parseBinaryExpr(12) parseBinaryExpr(13) parseBinaryExpr(14) ->parseUnaryExpr() ->newBinaryExpr(...) for operator PLUS parseBinaryExpr(13) parseBinaryExpr(14) ->parseUnaryExpr() ->newBinaryExpr(...) for operator SLASH parseBinaryExpr(14) ->parseUnaryExpr() ->newBinaryExpr(...) for operator PLUS parseBinaryExpr(13) parseBinaryExpr(14) ->parseUnaryExpr() ->newBinaryExpr(...) for operator PLUS parseBinaryExpr(13) parseBinaryExpr(14) ->parseUnaryExpr() parseBinaryExpr(1) parseBinaryExpr(2) parseBinaryExpr(3) parseBinaryExpr(4) parseBinaryExpr(5) parseBinaryExpr(6) parseBinaryExpr(7) parseBinaryExpr(8) parseBinaryExpr(9) parseBinaryExpr(10) parseBinaryExpr(11) parseBinaryExpr(12) parseBinaryExpr(13) parseBinaryExpr(14) ->parseUnaryExpr() ->newBinaryExpr(...) for operator EQUAL2 parseBinaryExpr(11) parseBinaryExpr(12) parseBinaryExpr(13) parseBinaryExpr(14) ->parseUnaryExpr() theon$
Optimization of parseLeftAssocBinaryExpr
1 2 3 4 5 6 7 8 9 10 11 12 13 | static const struct Expr *
parseLeftAssocBinaryExpr(int prec)
{
const struct Expr *expr = parseUnaryExpr();
for (int p = tokenKindPrec(token.kind); p >= prec; --p) {
while (tokenKindPrec(token.kind) == p) {
enum ExprKind op = makeBinaryExprKind(token.kind);
getToken();
expr = newBinaryExpr(op, expr, parseLeftAssocBinaryExpr(p + 1));
}
}
return expr;
}
|
For visualizing the call stack it was modified to:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 | static const struct Expr *
parseLeftAssocBinaryExpr(int prec)
{
printf("%*sparseBinaryExpr(%d)\n", prec, "", prec);
const struct Expr *expr = parseUnaryExpr();
printf("%*s->parseUnaryExpr()\n", prec, "");
for (int p = tokenKindPrec(token.kind); p >= prec; --p) {
while (tokenKindPrec(token.kind) == p) {
enum TokenKind opTokenKind = token.kind;
enum ExprKind op = makeBinaryExprKind(token.kind);
getToken();
printf("%*s->newBinaryExpr(...) for operator %s\n", prec, "",
strTokenKind(opTokenKind));
expr = newBinaryExpr(op, expr, parseLeftAssocBinaryExpr(p + 1));
}
}
return expr;
}
|
a + b * (c + d)
Here the call stack when the expression a + b * (c + d) gets parsed:
theon$ echo "a + b * (c + d);" | ./xtest_abc foo.s foo.tex parseBinaryExpr(1) ->parseUnaryExpr() ->newBinaryExpr(...) for operator PLUS parseBinaryExpr(13) ->parseUnaryExpr() ->newBinaryExpr(...) for operator ASTERISK parseBinaryExpr(14) parseBinaryExpr(1) ->parseUnaryExpr() ->newBinaryExpr(...) for operator PLUS parseBinaryExpr(13) ->parseUnaryExpr() ->parseUnaryExpr() theon$
z == x * ( y + z) / a + 5 + (x == y)
Here the call stack when the expression z == x * ( y + z) / a + 5 + (x == y) gets parsed:
theon$ echo "z == x * ( y + z) / a + 5 + (x == y);" | ./xtest_abc foo.s foo.tex parseBinaryExpr(1) ->parseUnaryExpr() ->newBinaryExpr(...) for operator EQUAL2 parseBinaryExpr(11) ->parseUnaryExpr() ->newBinaryExpr(...) for operator ASTERISK parseBinaryExpr(14) parseBinaryExpr(1) ->parseUnaryExpr() ->newBinaryExpr(...) for operator PLUS parseBinaryExpr(13) ->parseUnaryExpr() ->parseUnaryExpr() ->newBinaryExpr(...) for operator SLASH parseBinaryExpr(14) ->parseUnaryExpr() ->newBinaryExpr(...) for operator PLUS parseBinaryExpr(13) ->parseUnaryExpr() ->newBinaryExpr(...) for operator PLUS parseBinaryExpr(13) parseBinaryExpr(1) ->parseUnaryExpr() ->newBinaryExpr(...) for operator EQUAL2 parseBinaryExpr(11) ->parseUnaryExpr() ->parseUnaryExpr() theon$
Exercise
Implement parseLeftAssocBinaryExpr() and the auxiliary functions as described above. Do not forget to remove the forward declarations of the obsolete parse functions. In parseAssignmentExpr() call parseLeftAssocBinaryExpr(1) instead of parseLeftAssocBinaryExpr(9). Then nothing needs to be changed in this place when new left associative binary operators will be added.
Write a test program for the ABC compiler. For example (but feel free to make it more extensive):
1 2 3 4 5 6 7 | a + b * (c + d);
x + 1 == y;
x + 1 != y * z;
x + 1 <= y * z;
x + 1 >= y * z;
x + 1 < y * z;
x + 1 > y * z;
|
and check the expression trees in the generated Latex. In this case
theon$ ./xtest_abc test.s test.tex < test.abc parseBinaryExpr(1) ->parseUnaryExpr() ->newBinaryExpr(...) for operator PLUS parseBinaryExpr(13) ->parseUnaryExpr() ->newBinaryExpr(...) for operator ASTERISK parseBinaryExpr(14) parseBinaryExpr(1) ->parseUnaryExpr() ->newBinaryExpr(...) for operator PLUS parseBinaryExpr(13) ->parseUnaryExpr() ->parseUnaryExpr() parseBinaryExpr(1) ->parseUnaryExpr() ->newBinaryExpr(...) for operator PLUS parseBinaryExpr(13) ->parseUnaryExpr() ->newBinaryExpr(...) for operator EQUAL2 parseBinaryExpr(11) ->parseUnaryExpr() parseBinaryExpr(1) ->parseUnaryExpr() ->newBinaryExpr(...) for operator PLUS parseBinaryExpr(13) ->parseUnaryExpr() ->newBinaryExpr(...) for operator NOT_EQUAL parseBinaryExpr(11) ->parseUnaryExpr() ->newBinaryExpr(...) for operator ASTERISK parseBinaryExpr(14) ->parseUnaryExpr() parseBinaryExpr(1) ->parseUnaryExpr() ->newBinaryExpr(...) for operator PLUS parseBinaryExpr(13) ->parseUnaryExpr() ->newBinaryExpr(...) for operator LESS_EQUAL parseBinaryExpr(12) ->parseUnaryExpr() ->newBinaryExpr(...) for operator ASTERISK parseBinaryExpr(14) ->parseUnaryExpr() parseBinaryExpr(1) ->parseUnaryExpr() ->newBinaryExpr(...) for operator PLUS parseBinaryExpr(13) ->parseUnaryExpr() ->newBinaryExpr(...) for operator GREATER_EQUAL parseBinaryExpr(12) ->parseUnaryExpr() ->newBinaryExpr(...) for operator ASTERISK parseBinaryExpr(14) ->parseUnaryExpr() parseBinaryExpr(1) ->parseUnaryExpr() ->newBinaryExpr(...) for operator PLUS parseBinaryExpr(13) ->parseUnaryExpr() ->newBinaryExpr(...) for operator LESS parseBinaryExpr(12) ->parseUnaryExpr() ->newBinaryExpr(...) for operator ASTERISK parseBinaryExpr(14) ->parseUnaryExpr() parseBinaryExpr(1) ->parseUnaryExpr() ->newBinaryExpr(...) for operator PLUS parseBinaryExpr(13) ->parseUnaryExpr() ->newBinaryExpr(...) for operator GREATER parseBinaryExpr(12) ->parseUnaryExpr() ->newBinaryExpr(...) for operator ASTERISK parseBinaryExpr(14) ->parseUnaryExpr() theon$ lualatex test.tex > /dev/null theon$
you should get this pdf.