========================================= Recursive Descent Parser for a Calculator [TOC] ========================================= ---- VIDEO ------------------------------ https://www.youtube.com/embed/OWp3zPiCcAE ----------------------------------------- In this video a predictive __recursive descent parser__ was implemented. This is only possible for so called __LL(k)__ grammars. So it is not possible to parse all context free grammars with this type of a parser. Furthermore, the methodic shown for translating an __EBNF__ grammar into a parser implementation even requires a _LL(1)_ grammar. So with that we are limited to an even smaller subset. Fortunately most programming languages of practical interest can be parsed with a recursive descent parser. :links: recursive descent parser -> https://en.wikipedia.org/wiki/Recursive_descent_parser EBNF -> https://en.wikipedia.org/wiki/Extended_Backus–Naur_form LL\(k\) -> https://en.wikipedia.org/wiki/LL_parser Current Grammar for the Calculator ================================== Below you find the grammar of the calculator developed in the video. Note that the production rule for $\langle\text{translation-unit}\rangle$ gets kind of canceled out when it gets rewritten into EBNF. ---- LATEX --------------------------------------------------------------------- \begin{array}{rcl} \langle\text{translation-unit}\rangle & \to & \texttt{EOI} \\ & \to & \langle\text{input-sequence}\rangle \; \texttt{EOI} \\ \langle\text{input-sequence}\rangle & \to & \langle\text{expr-statement}\rangle\; \\ & \to & \langle\text{input-sequence}\rangle\; \langle\text{expr-statement}\rangle \\ \langle\text{expr-statement}\rangle & \to & \langle\text{expr}\rangle\; \texttt{';'}\; \\ \langle\text{expr}\rangle & \to & \langle\text{term}\rangle\; \\ & \to & \langle\text{expr}\rangle\; \texttt{'+'}\; \langle\text{term}\rangle\; \\ & \to & \langle\text{expr}\rangle\; \texttt{'-'}\; \langle\text{term}\rangle\; \\ \langle\text{term}\rangle & \to & \langle\text{factor}\rangle\; \\ & \to & \langle\text{term}\rangle\; \texttt{'*'}\; \langle\text{factor}\rangle\; \\ & \to & \langle\text{term}\rangle\; \texttt{'/'}\; \langle\text{factor}\rangle\; \\ \langle\text{factor}\rangle & \to & \langle\text{dec-literal}\rangle\; \\ & \to & \langle\text{hex-literal}\rangle\; \\ & \to & \langle\text{oct-literal}\rangle\; \\ & \to & \texttt{'('}\; \langle\text{expr}\rangle\; \texttt{')'}\; \\ \end{array} -------------------------------------------------------------------------------- The grammar can be rewritten into EBNF notation as follows: ---- LATEX --------------------------------------------------------------------- \begin{array}{rcl} \text{translation-unit} & = & [\; \text{input-sequence} ] \; \texttt{EOI} \\ \text{input-sequence} & = & \{\; \text{expr-statement}\; \} \\ \text{expr-statement} & = & \text{expr}\; \texttt{;}\; \\ \text{expr} & = & \text{term}\; \{\; (\; \texttt{"+"}\; |\; \texttt{"-"}\; )\; \text{term} \} \\ \text{term} & = & \text{factor}\; \{\; (\; \texttt{"*"}\; |\; \texttt{"/"}\; )\; \text{factor} \} \\ \text{factor} & = & \text{dec-literal}\; \\ & | & \text{hex-literal}\; \\ & | & \text{oct-literal}\; \\ & | & \texttt{"("}\; \text{expr}\; \texttt{")"}\; \\ \end{array} -------------------------------------------------------------------------------- In EBNF you see that the rule for $\text{translation-unit}$ can be fused with the rule for $\text{input-sequence}$ to ---- LATEX --------------------------------------------------------------------- \text{translation-unit} = [\;\{\; \text{expr-statement}\;\} ] \; \texttt{EOI} -------------------------------------------------------------------------------- Obviously an optional repetition is just a repetition. So this can be simplified to ---- LATEX --------------------------------------------------------------------- \text{translation-unit} = \{\; \text{expr-statement}\;\} \texttt{EOI} -------------------------------------------------------------------------------- And with renaming it would be ---- LATEX --------------------------------------------------------------------- \text{input-sequence} = [\;\{\; \text{expr-statement}\;\} ] \; \texttt{EOI} -------------------------------------------------------------------------------- So we have: ---- LATEX --------------------------------------------------------------------- \begin{array}{rcl} \text{input-sequence} & = & \{\; \text{expr-statement}\; \}\; \texttt{EOI} \\ \text{expr-statement} & = & \text{expr}\; \texttt{;}\; \\ \text{expr} & = & \text{term}\; \{\; (\; \texttt{"+"}\; |\; \texttt{"-"}\; )\; \text{term} \} \\ \text{term} & = & \text{factor}\; \{\; (\; \texttt{"*"}\; |\; \texttt{"/"}\; )\; \text{factor} \} \\ \text{factor} & = & \text{dec-literal}\; \\ & | & \text{hex-literal}\; \\ & | & \text{oct-literal}\; \\ & | & \texttt{"("}\; \text{expr}\; \texttt{")"}\; \\ \end{array} -------------------------------------------------------------------------------- The parse function _parseInputSequence()_ can be implemented with a simple while loop. It checks in its condition whether the current token is not _EOI_ and parses in its loop body an expression statement. Provided Material ================= ---- SHELL (path=session17/git, hide) ------------------------------------------ rm -rf abc git clone git@gitlab.com:uni-ulmulm-university-department-of-numerical-analysis/abc.git git config --global advice.detachedHead false -------------------------------------------------------------------------------- ---- SHELL (path=session17/git/abc, hide) -------------------------------------- git checkout tags/calc-quiz make touch xtest_calc.c -------------------------------------------------------------------------------- Here the source file ~xtest_calc.c~ developed in the video: :import: session17/git/abc/xtest_calc.c [fold] If you have __fixed the makefile__ it should compile with a simple _make_ like that: ---- SHELL (path=session17/git/abc) -------------------------------------------- make echo "12 * (3 + 4);" | ./xtest_calc -------------------------------------------------------------------------------- To get rid of the debug messages compile with _CPPFLAGS=-DNDEBUG_: ---- SHELL (path=session17/git/abc) -------------------------------------------- touch xtest_calc.c make CPPFLAGS=-DNDEBUG echo "12 * (3 + 4);" | ./xtest_calc -------------------------------------------------------------------------------- :links: fixed the makefile -> doc:session17/page01 The calculator is part of ABC compiler project. Here the headers and source files that we currently have: :import: session17/git/abc/lexer.h [fold] :import: session17/git/abc/lexer.c [fold] :import: session17/git/abc/tokenkind.c [fold] :import: session17/git/abc/tokenkind.h [fold] The string class is not the solution for the previous quiz. It still uses a buffer with fixed size. If you have already solved the quiz then you of course can and should use your solution instead: :import: session17/git/abc/str.h [fold] :import: session17/git/abc/str.c [fold] Quiz 18: Extend the Calculator ============================== Extend the calculator so that it supports: - Unary plus and minus expressions. An unary minus negates a value, and a unary plus has no effect. For example: - `---5` should be equal to `-5`, - `+-+-5` should equal `5`, - `6 + -5` should equal `6 - 5`. - An operator '_^_' for computing the power. For example: - `5^3` should compute `5 * 5 * 5`, - `5^-3` should compute `1 / (5 * 5 * 5)`. The calculator currently uses unsigned integer arithmetic. Change it to double precision, i.e. use the type _double_ for the value of an expression node. For computing the power you can use the function __pow()__ defined in __. This will on some platform you explicitly have to link against the C math library. You either can use the option _LDFLAGS=-lm_ when you run make or add ---- CODE (type=mk) ------------------------------------------------------------ LDFLAGS += -lm # extend value of LDFLAGS for '-lm' -------------------------------------------------------------------------------- as first line to your makefile. Adapt the lexer so that it can detect '_^_' (caret) as token. :links: pow\(\) -> https://linux.die.net/man/3/pow Grammar in EBNF --------------- Use the following grammar in EBNF for the extension of the calculator: ---- LATEX --------------------------------------------------------------------- \begin{array}{rcl} \text{input-sequence} & = & \{\; \text{expr-statement}\; \}\; \texttt{EOI} \\ \text{expr-statement} & = & \text{expr}\; \texttt{;}\; \\ \text{expr} & = & \text{term}\; \{\; (\; \texttt{"+"}\; |\; \texttt{"-"}\; )\; \text{term} \} \\ \text{term} & = & \text{power-expr}\; \{\; (\; \texttt{"*"}\; |\; \texttt{"/"}\; )\; \text{power-expr} \} \\ \text{power-expr} & = & \text{unary-expr}\; [\; \texttt{"^"}\; \text{power-expr} ]\; \\ \text{unary-expr} & = & \text{factor}\; |\; (\; \texttt{"+"}\; |\; \texttt{"-"}\;)\; \text{unary-expr} \\ \text{factor} & = & \text{dec-literal}\; \\ & | & \text{hex-literal}\; \\ & | & \text{oct-literal}\; \\ & | & \texttt{"("}\; \text{expr}\; \texttt{")"}\; \\ \end{array} -------------------------------------------------------------------------------- Example Output -------------- ---- SHELL (path=session17/sol, hide) ------------------------------------------ rm -rf abc git clone git@gitlab.com:uni-ulmulm-university-department-of-numerical-analysis/abc.git git config --global advice.detachedHead false -------------------------------------------------------------------------------- ---- SHELL (path=session17/sol/abc, hide) -------------------------------------- git checkout tags/calc-quiz-sol make touch xtest_calc.c -------------------------------------------------------------------------------- Here a file with some expression statements for testing the calculator: :import: session17/sol/abc/test_calc.in And here what the calculator computes from that: ---- SHELL (path=session17/sol/abc) -------------------------------------------- make CPPFLAGS=-DNDEBUG LDFLAGS=-lm xtest_calc ./xtest_calc < test_calc.in -------------------------------------------------------------------------------- Note that the grammar specifies that for example: - the expression "_-5^2_" should be evaluated like "_(-5)^2_", - the expression "_3-5^2_" should be evaluated like "_3-5^2_", - the expression "_3--5^2_" should be evaluated like "_3-(-5)^2_". Test such cases. ---- SHELL (path=session17/sol, hide) ------------------------------------------ rm -rf abc -------------------------------------------------------------------------------- Submission ---------- Submit your improved calculator implementation as follows: ---- CODE (type=sh) -------------------------------------------------------------- submit hpc quiz18 xtest_calc.c str.h str.c lexer.h lexer.c tokenkind.h tokenkind.c ----------------------------------------------------------------------------------