======================


Another Grammar Update							[TOC]
======================

There will actually be more than one update. The first update will allow
empty expression statements like in C. Currently this program

---- CODE (file=session25/git/abc_step1/empty_expr_statement.abc) --------------
;
--------------------------------------------------------------------------------

produces a syntax error:

---- SHELL (path=session25/git/abc_step1,hide) ---------------------------------
make
--------------------------------------------------------------------------------

---- SHELL (path=session25/git/abc_step1) --------------------------------------
./xtest_abc < empty_expr_statement.abc
--------------------------------------------------------------------------------

In the grammar we just have to change the production for an expression
statement to

---- LATEX ---------------------------------------------------------------------
\begin{array}{rcl}
\text{expr-statement}
    & = &
    [ \text{assignment-expr} ]\;
    \texttt{";"}\;
    \\
\end{array}
--------------------------------------------------------------------------------

Hence, we have to update the parser again.

Updating the Parse Functions: The Obvious Approach
==================================================
In ``parseExprStatement`` we could first check if the current token is the
semicolon and in this case return after consuming the token:

---- CODE (type=c) -------------------------------------------------------------
static void
parseExprStatement(void)
{
    if (token.kind == SEMICOLON) {
	getToken();
	return;
    }
    /* rest as before */
}
--------------------------------------------------------------------------------

This will solve the problem:

---- SHELL (path=session25/git/abc_step2,hide) ---------------------------------
make
echo ";" > empty_expr_statement.abc
--------------------------------------------------------------------------------

---- SHELL (path=session25/git/abc_step2) --------------------------------------
./xtest_abc < empty_expr_statement.abc
--------------------------------------------------------------------------------

However, it will later turn out to be more convenient to approach things in a
way that now seems to be unnecessarily complicated. To be clear: do not apply
the above change to ``parseExprStatement()``.


Updating the Parse Functions: What Helps Us Later
=================================================

It will later be more convenient if ``parseAssignmentExpr()`` returns a
null pointer if no expression was found. For now it means that we have to
struggle through a few changes. In short: before a new expression node gets
created we first have to check that none of the pointers to its child nodes is
a null pointer.

Changes in ``parsePrimaryExpr()``
---------------------------------
Currently an assertion failure gets triggered if no primary expression could be
parsed. Remove the assertion and simply return a null pointer in this case.
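
The following is only a minimal, self-contained sketch of this idea and not
the actual implementation. The toy parser reads single characters from a
string instead of tokens, and the names ``ToyExpr``, ``toyInput`` and
``toyParsePrimary()`` are hypothetical. It merely illustrates that "no primary
expression here" is reported with a null pointer and without consuming input:

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical toy node; the real compiler uses struct Expr. */
struct ToyExpr {
    int value;
};

static const char *toyInput = "";   /* remaining input */
static struct ToyExpr toyNode;      /* one static node is enough for a sketch */

/*
 * Returns a node for a one-digit literal, or a null pointer if the current
 * character does not start a primary expression. In the latter case no
 * input is consumed.
 */
static const struct ToyExpr *
toyParsePrimary(void)
{
    if (isdigit((unsigned char)*toyInput)) {
        toyNode.value = *toyInput - '0';
        ++toyInput;
        return &toyNode;
    }
    return NULL;    /* instead of an assertion failure */
}
```

With ``toyInput`` pointing to "7;" the first call returns a node with value 7,
and a second call returns a null pointer and leaves the semicolon untouched.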

Changes in ``parseUnaryExpr()``
-------------------------------
If a unary operator was found it has to be followed by a non-empty expression.
Otherwise it is a syntax error:

---- CODE (type=c) -------------------------------------------------------------
static const struct Expr *
parseUnaryExpr(void)
{
    if (token.kind == PLUS || token.kind == MINUS) {
	enum TokenKind op = token.kind;
	getToken();
	const struct Expr *expr = parseUnaryExpr();
	if (!expr) {
	    expectedError("non-empty expression");
	}
	if (op == MINUS) {
	    return newUnaryExpr(EK_UNARY_MINUS, expr);
	}
	return newUnaryExpr(EK_UNARY_PLUS, expr);
    }
    return parsePrimaryExpr();
}
--------------------------------------------------------------------------------

Changes in ``parseLeftAssocBinaryExpr()``
-----------------------------------------
If a binary operator was found it has to be followed by a non-empty expression.
Add a check similar to the one in ``parseUnaryExpr()``.
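
As a rough sketch of where such a check goes (again a toy and not the actual
parse function): the hypothetical ``toyParseSum()`` below works on characters,
computes a value instead of building an expression tree, and sets the
hypothetical flag ``sumSyntaxError`` where the real parser would call
``expectedError()``. A missing left operand just means "empty expression",
whereas a missing right operand after an operator is a syntax error:

```c
#include <ctype.h>
#include <stdbool.h>

static const char *sumInput = "";       /* remaining input */
static bool sumSyntaxError = false;     /* stands in for expectedError() */

/* Operand: a single digit. Returns false, consuming nothing, if absent. */
static bool
toyParseOperand(int *value)
{
    if (!isdigit((unsigned char)*sumInput)) {
        return false;
    }
    *value = *sumInput++ - '0';
    return true;
}

/* sum = operand { "+" operand }. Returns false for an empty expression. */
static bool
toyParseSum(int *value)
{
    if (!toyParseOperand(value)) {
        return false;               /* empty expression: not an error here */
    }
    while (*sumInput == '+') {
        ++sumInput;
        int right;
        if (!toyParseOperand(&right)) {
            sumSyntaxError = true;  /* operator without right operand */
            return false;
        }
        *value += right;
    }
    return true;
}
```

For the input "3+;" the function returns false with ``sumSyntaxError`` set,
for ";" it returns false without setting the flag.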

Changes in ``parseAssignmentExpr()``
------------------------------------
If an assignment operator was found it has to be followed by a non-empty
expression.  Add a check similar to the ones in ``parseUnaryExpr()`` and
``parseLeftAssocBinaryExpr()``.

Changes in ``parseExprStatement()``
-----------------------------------
If the expression is empty no code gets generated. In either case a semicolon
is expected. It gets consumed and the parse function returns:

---- CODE (type=c) -------------------------------------------------------------
static void
parseExprStatement(void)
{
    const struct Expr *expr = parseAssignmentExpr();
    if (expr) {
	const struct Expr *folded = constFoldExpr(expr);
    	GenReg dest = genGetReg();
	/* ... as before ... */
    }
    expected(SEMICOLON);
    getToken();
}
--------------------------------------------------------------------------------

Changes in ``parse()``
----------------------
Here we currently parse our I/O hack. Both operators "_$>_" and "_$<_"
have to be followed by a non-empty assignment expression. Below it will be
suggested to move the parsing of the I/O hack into a function of its own. But
before another change gets applied the current modifications should be tested.


Tests for the Parser Update
===========================
For the following inputs the compiler should in some cases report a syntax
error, but it should never trigger an assertion failure:

---- SHELL (path=session25/git/abc_step3,hide) ---------------------------------
make
--------------------------------------------------------------------------------

---- SHELL (path=session25/git/abc_step3) --------------------------------------
# this should give a syntax error
echo "$>;" | ./xtest_abc foo.s || echo "ok! Error was expected"
# this should give a syntax error
echo "$<;" | ./xtest_abc foo.s || echo "ok! Error was expected"
# this should give a syntax error
echo "3 +;" | ./xtest_abc foo.s || echo "ok! Error was expected"
# also this
echo "+;" | ./xtest_abc foo.s || echo "ok! Error was expected"
# this should be ok
echo ";" | ./xtest_abc foo.s || echo "Oops! This should compile"
--------------------------------------------------------------------------------

Some Cleanup: Grammar and Parse Functions for Statements
========================================================
So far our grammar was focused on expressions. We only had two kinds of
statements: expression statements and the I/O hack. This will change soon when
we add statements for control structures. Let's already prepare the grammar for
that:

---- LATEX ---------------------------------------------------------------------
\begin{array}{rcl}
\text{input-sequence}
    & = &
    \{\;
    \text{statement}\;
    \}\;
    \\
\text{statement}
    & = &
    \text{io-hack-statement}\;
    \\
    & | &
    \text{expr-statement}\;
    \\
\text{io-hack-statement}
    & = &
    (\;
    \texttt{"\$>"}\;
    | \;
    \texttt{"\$<"}\;
    ) \;
    \text{assignment-expr}\;
    \texttt{";"}\;
    \\
\text{expr-statement}
    & = &
    [ \text{assignment-expr} ]\;
    \texttt{";"}\;
    \\
\end{array}
--------------------------------------------------------------------------------

It might actually be a good idea to have two source files for parse functions:
``parse_expr.c`` for parsing expressions and ``parse_stmnt.c`` for parsing
statements. However, for now we keep everything in ``parse.c``. But
conceptually this separation will be reflected in how the code gets
reorganized.

Furthermore, parse functions for statements will have a boolean as return type.
If a parse function for a statement returns _false_ it means two things:

- no token was consumed and
- the current token does not initiate a corresponding statement.

If the parse function returns _true_ the corresponding statement could be
parsed and all its tokens are consumed when the function returns.

This is possible because in our grammar the kind of a statement can be inferred
from its first token. For example, a _while-statement_ will begin with a
_while_ token, a _for-statement_ with a _for_ token, etc.
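
This contract can be sketched with a toy example, which is not part of the
compiler. Each character plays the role of one token, the characters 'w' and
'f' stand in for hypothetical _while_ and _for_ tokens, and all function names
are made up for illustration. Because a parse function that returns _false_
has consumed nothing, the next alternative can simply be tried:

```c
#include <stdbool.h>

/* Remaining input; each character plays the role of one token. */
static const char *stmtInput = "";

/*
 * Hypothetical toy statement: starts with the character `first` and extends
 * up to and including the next ';'. Returns false, without consuming
 * anything, if the input does not start with `first`. (A real parser would
 * report a missing ';' as a syntax error; this toy just stops at the end.)
 */
static bool
toyParseStmtStartingWith(char first)
{
    if (*stmtInput != first) {
        return false;
    }
    while (*stmtInput != '\0' && *stmtInput != ';') {
        ++stmtInput;
    }
    if (*stmtInput == ';') {
        ++stmtInput;
    }
    return true;
}

/* Tries each kind of statement; the first token decides which one applies. */
static bool
toyParseStatement(void)
{
    return toyParseStmtStartingWith('w') || toyParseStmtStartingWith('f');
}
```

For the input "w12;f;x" two calls of ``toyParseStatement()`` succeed. The third
call returns _false_ and leaves the input at 'x'.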

Memory Management
-----------------
So far expressions could only occur within an expression statement (or an I/O
hack). This will no longer be the case. Every statement can contain an
expression. Memory for expressions can be released after a statement was parsed
and code for it generated.
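
How ``deleteAllExpr()`` is implemented is not shown here. One possible scheme,
given purely as an assumption with hypothetical names (``PoolExpr``,
``poolNewExpr()``, ``poolDeleteAllExpr()``), is to link every allocated node
into a list so that all nodes can be released in one sweep after a statement:

```c
#include <stdlib.h>

/* Hypothetical node type; the real compiler uses struct Expr. */
struct PoolExpr {
    int value;
    struct PoolExpr *nextAllocated;     /* links all nodes ever allocated */
};

static struct PoolExpr *allocated = NULL;   /* head of the allocation list */
static size_t numAllocated = 0;             /* only for illustration */

/* Allocates a node and registers it in the allocation list. */
static struct PoolExpr *
poolNewExpr(int value)
{
    struct PoolExpr *e = malloc(sizeof(*e));
    if (!e) {
        abort();                        /* out of memory */
    }
    e->value = value;
    e->nextAllocated = allocated;
    allocated = e;
    ++numAllocated;
    return e;
}

/* Releases every node that was allocated since the last call. */
static void
poolDeleteAllExpr(void)
{
    while (allocated) {
        struct PoolExpr *next = allocated->nextAllocated;
        free(allocated);
        allocated = next;
        --numAllocated;
    }
}
```

With such a scheme the parse functions never have to track which node owns
which child. Calling the cleanup function once per statement is sufficient.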

Functions ``parse()`` and ``parseStatement()``
----------------------------------------------
Here are all the forward declarations of the parse functions and the "main"
parse function _parse()_. It parses a sequence of statements until the end of
input is reached. Memory for expressions is released after each statement was
parsed:

---- CODE (type=c) -------------------------------------------------------------
// for parsing statements
static bool parseStatement(void);
static bool parseIOHackStatement(void);
static bool parseExprStatement(void);

// for parsing expressions
static const struct Expr *parseAssignmentExpr(void);
static const struct Expr *parseLeftAssocBinaryExpr(int prec);
static const struct Expr *parseUnaryExpr(void);
static const struct Expr *parsePrimaryExpr(void);

void
parse(void)
{
    while (token.kind != EOI) {
        if (!parseStatement()) {
            expectedError("statement");
        }
	deleteAllExpr();
    }
}
--------------------------------------------------------------------------------

Function ``parseStatement()`` tries to parse any kind of statement and returns
_true_ if this succeeds and otherwise _false_. Note that it will not matter in
which order it tries to parse the different kinds of statements.

---- CODE (type=c) -------------------------------------------------------------
static bool
parseStatement(void)
{
    if (parseExprStatement()) {
        return true;
    } else if (parseIOHackStatement()) {
        return true;
    } else {
	return false;
    }
}
--------------------------------------------------------------------------------

Function ``parseExprStatement()``
---------------------------------
If no expression could be parsed and the current token is not a semicolon the
function returns _false_. Apart from no longer calling ``deleteAllExpr()`` the
rest of the implementation is unchanged:

---- CODE (type=c) -------------------------------------------------------------
static bool
parseExprStatement(void)
{
    const struct Expr *expr = parseAssignmentExpr();
    if (expr) {
	/* ... almost as before: no longer call deleteAllExpr() here ... */
    } else if (token.kind != SEMICOLON) {
	return false;
    }
    expected(SEMICOLON);
    getToken();
    return true;
}
--------------------------------------------------------------------------------

Function ``parseIOHackStatement()``
-----------------------------------
If the current token is not a dollar sign it immediately returns _false_.
Otherwise it parses the I/O hack:

---- CODE (type=c) -------------------------------------------------------------
static bool
parseIOHackStatement(void)
{
    if (token.kind != DOLLAR) {
	return false;
    }
    getToken();
    if (token.kind == GREATER) {
	getToken();
	// read unsigned integer
	/* ... as before ... */
    } else if (token.kind == LESS) {
	getToken();
	// print unsigned integer
	/* ... as before ... */
    } else {
	expectedError("'>' or '<'");
    }
    expected(SEMICOLON);
    getToken();
    return true;
}
--------------------------------------------------------------------------------

Exercise
========
Rearrange the parse functions as outlined above. Test your implementation for
example with

---- CODE (file=session25/git/abc_step4/test2.abc) ------------------------------
a + b * (c + d);
x + 1 == y;
x + 1 != y * z;
x + 1 <= y * z;
x + 1 >= y * z;
x + 1 < y * z;
x + 1 > y * z;
;
--------------------------------------------------------------------------------

After generating LaTeX code for the expression tree representations with

---- SHELL (path=session25/git/abc_step4/) -------------------------------------
./xtest_abc test2.s test2.tex < test2.abc
lualatex test2.tex > /dev/null
--------------------------------------------------------------------------------

---- SHELL (path=session25/git/abc_step4/, hide) -------------------------------
cp test2.pdf /home/www/htdocs/numerik/hpc/ss22/hpc0/session25/
--------------------------------------------------------------------------------

you should get this __pdf__. Note that the empty expression statement will not
show up in the document.

:links: pdf -> https://www.mathematik.uni-ulm.de/numerik/hpc/ss22/hpc0/session25/test2.pdf