Calculator with variables

Using the calculator

Here is an example that uses variables:

a=2*(1+2)
b=a*4 + a
b*a

If you save this in a file example6 and pass the file name as a parameter to ulmcalc-step3.ast, it gets evaluated line by line:

theon$ ulmcalc-step3.ast example6
> a = 2 * (1 + 2)
a = 6

> b = a * 4 + a
b = 30

> b * a
180
theon$ 

Alternatively, you can run ulmcalc-step3.ast without a file. In that case, type in the statements line by line and terminate the input with CTRL-D.

Video tutorial

Here are the changes in the lexer, parser and code generator:

  • scanner.cpp in /home/numerik/pub/ulm-calc/astl-calc-step3/,

  • parser.ypp in /home/numerik/pub/ulm-calc/astl-calc-step3/ and

  • most relevant, eval_step3.ast in /home/numerik/pub/ulm-calc/lib/:

    Even without understanding the ASTL language developed by Andreas F. Borchert, you can see that it just describes the pre-order and post-order rules for certain node types that were mentioned in the video.

Lexical elements

The lexer no longer treats a newline as the end of input, because we want to allow multiple lines with expressions.

Tokens can be identifiers, decimal constants, operators and an eol (end of line) token:

  • Identifiers begin with a letter, i.e. 'A' to 'Z' and 'a' to 'z', an underscore '_', or a dot '.' and are optionally continued with a sequence of more letters, underscores, dots, or decimal digits '0' to '9'.

  • Decimal constants are described by the regular expression 0|[1-9][0-9]*

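  • The characters '+', '-', '*', '/' and '%' are operators. The assignment character '=' and the parentheses '(' and ')' are tokens as well, as the ASSIGN token in the lexer output and the grammar on this page show.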

  • The newline '\n' generates an eol (end of line) token.

Spaces get consumed without producing a token. A newline is no longer treated as the end of the input; it only produces an eol token.

For example, the file lex-example contains the following tokens:

theon$ ulmcalc-step3-lexer lex-example
DECIMAL_CONSTANT "11" at lex-example:1.1-2
IDENT "foo11" at lex-example:1.3-7
ASTERISK at lex-example:1.8
SLASH at lex-example:1.9
PERCENT at lex-example:1.10
MINUS at lex-example:1.11-12
ASSIGN at lex-example:1.13
PLUS at lex-example:1.14
IDENT "dummy1.a" at lex-example:1.15-22
PERCENT at lex-example:1.23
EOL at lex-example:1.24
theon$ 
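
The token rules of this section can be sketched as a small regular-expression tokenizer. This is only an illustration in Python and not the scanner.cpp used by the calculator. The token names follow the lexer output above; LPAREN and RPAREN are assumed names, and source positions are not tracked.

```python
import re

# Token rules from the list in this section. Names follow the lexer output
# shown on this page; LPAREN and RPAREN are assumed names.
TOKEN_SPEC = [
    ("DECIMAL_CONSTANT", r"0|[1-9][0-9]*"),
    ("IDENT", r"[A-Za-z_.][A-Za-z_.0-9]*"),
    ("PLUS", r"\+"),
    ("MINUS", r"-"),
    ("ASTERISK", r"\*"),
    ("SLASH", r"/"),
    ("PERCENT", r"%"),
    ("ASSIGN", r"="),
    ("LPAREN", r"\("),
    ("RPAREN", r"\)"),
    ("EOL", r"\n"),
    ("SPACE", r"[ \t]+"),  # consumed, never emitted
]
MASTER = re.compile("|".join(f"(?P<{name}>{rx})" for name, rx in TOKEN_SPEC))


def tokenize(text):
    """Return a list of (token_name, lexeme) pairs for the given input."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "SPACE":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

For instance, tokenize("11foo11*\n") yields DECIMAL_CONSTANT "11", IDENT "foo11", ASTERISK and EOL, which agrees with the first lines of the lexer output above.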

Grammar

You can read the grammar as follows:

  • The calculator accepts a list of statements. Each of these statements has to end with a newline (eol).

    \[\begin{array}{lcl}\langle\text{calc}\rangle & \to & \langle\text{statement-list}\rangle \\\langle\text{statement-list}\rangle & \to & \langle\text{statement}\rangle \quad \text{eol}\\ & \to & \langle\text{statement-list}\rangle \quad \langle\text{statement}\rangle\quad \text{eol}\\\end{array}\]
  • Statements can be expressions or assignments.

    \[\begin{array}{lcl} \langle\text{statement}\rangle & \to &\langle\text{expression}\rangle \\ & \to & \langle\text{assignment}\rangle \\\langle\text{expression}\rangle & \to & \langle\text{exp}\rangle \\\langle\text{assignment}\rangle & \to & \langle\text{identifier}\rangle\quad\text{"="}\quad \langle\text{exp}\rangle \\ \end{array}\]
  • The grammar for \(\langle\text{exp}\rangle\) was basically just extended so that an \(\langle\text{identifier}\rangle\) can be used like an \(\langle\text{integer}\rangle\):

    \[\begin{array}{lcl}\langle\text{exp}\rangle & \to & \langle\text{term}\rangle \\ & \to & \langle\text{exp}\rangle \quad \text{'+'} \quad \langle\text{term}\rangle \\ & \to & \langle\text{exp}\rangle \quad \text{'-'} \quad \langle\text{term}\rangle \\\langle\text{term}\rangle & \to & \langle\text{factor}\rangle \\ & \to & \langle\text{term}\rangle \quad\text{'*'}\quad \langle\text{factor}\rangle \\ & \to & \langle\text{term}\rangle \quad\text{'/'}\quad \langle\text{factor}\rangle \\ & \to & \langle\text{term}\rangle \quad\text{'%'}\quad \langle\text{factor}\rangle \\\langle\text{factor}\rangle & \to & \langle\text{primary}\rangle \\ & \to & \langle\text{unary-minus}\rangle \\\langle\text{unary-minus}\rangle & \to & \text{'-'} \quad \langle\text{primary}\rangle \\\langle\text{primary}\rangle & \to & \langle\text{integer}\rangle \\ & \to & \langle\text{identifier}\rangle \\ & \to & \text{'('} \quad \langle\text{exp}\rangle \quad \text{')'}\\\langle\text{integer}\rangle & \to & \text{decimal-constant} \\\langle\text{identifier}\rangle & \to & \text{ident} \\\end{array}\]
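
To see how these rules turn a statement into a tree, here is a minimal sketch of a hand-written recursive-descent parser in Python. It is not the parser from parser.ypp: the left-recursive rules are replaced by loops, the eol token is modelled by parsing one line at a time, and the tuple shapes such as ("assign", name, tree) as well as the names Parser and parse_statement are assumptions made for this illustration.

```python
import re

IDENT_RX = r"[A-Za-z_.][A-Za-z_.0-9]*"


def is_ident(tok):
    return re.fullmatch(IDENT_RX, tok) is not None


class Parser:
    def __init__(self, line):
        # Simplified tokenizer: identifiers, digit sequences, single characters.
        self.tokens = re.findall(IDENT_RX + r"|[0-9]+|\S", line)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def next(self):
        tok = self.peek()
        if tok is None:
            raise SyntaxError("unexpected end of line")
        self.pos += 1
        return tok

    def statement(self):
        # <statement> -> <assignment> | <expression>
        toks = self.tokens
        if len(toks) >= 2 and is_ident(toks[0]) and toks[1] == "=":
            name = self.next()
            self.next()  # skip "="
            tree = ("assign", name, self.exp())
        else:
            tree = self.exp()
        if self.peek() is not None:
            raise SyntaxError(f"unexpected token {self.peek()!r}")
        return tree

    def exp(self):
        # <exp> -> <term> | <exp> '+' <term> | <exp> '-' <term>
        left = self.term()
        while self.peek() in ("+", "-"):
            op = self.next()
            left = (op, left, self.term())
        return left

    def term(self):
        # <term> -> <factor> | <term> ('*' | '/' | '%') <factor>
        left = self.factor()
        while self.peek() in ("*", "/", "%"):
            op = self.next()
            left = (op, left, self.factor())
        return left

    def factor(self):
        # <factor> -> <primary> | '-' <primary>
        if self.peek() == "-":
            self.next()
            return ("neg", self.primary())
        return self.primary()

    def primary(self):
        # <primary> -> <integer> | <identifier> | '(' <exp> ')'
        tok = self.next()
        if tok == "(":
            inner = self.exp()
            if self.next() != ")":
                raise SyntaxError("expected ')'")
            return inner
        if tok.isdigit():
            return ("int", int(tok))
        if is_ident(tok):
            return ("ident", tok)
        raise SyntaxError(f"unexpected token {tok!r}")


def parse_statement(line):
    return Parser(line).statement()
```

The loops in exp and term build left-leaning trees, so 1-2-3 is grouped as (1-2)-3, just as the left-recursive productions prescribe.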

For the initial example on this page you get the following syntax tree

Semantic error detected in the code generator

We finally have the possibility to construct an example where an error cannot be detected before code generation:

We are using an undefined variable and get this semantic error:

theon$ ulmcalc-step3.ast example7
> b = a + 1
unknown identifier a
theon$ 

This gets detected when the code generator tries to evaluate the tree generated by the parser.
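
Why this error only shows up during evaluation can be illustrated with a small tree walker in Python. It is a sketch and not the ASTL code in eval_step3.ast. The tree is assumed to consist of tuples such as ("int", 1), ("ident", "a"), ("assign", name, tree), ("neg", tree) and (op, left, right); the use of Python's floor division for '/' is also an assumption and may differ from the calculator for negative operands.

```python
import operator

# Binary operators; '/' is mapped to floor division (an assumption).
BINARY = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.floordiv,
    "%": operator.mod,
}


def evaluate(node, env):
    """Evaluate a statement tree; env maps variable names to values."""
    kind = node[0]
    if kind == "int":
        return node[1]
    if kind == "ident":
        # The tree itself is well-formed. Only the lookup at evaluation
        # time reveals that the variable was never assigned.
        if node[1] not in env:
            raise NameError(f"unknown identifier {node[1]}")
        return env[node[1]]
    if kind == "assign":
        # Evaluate the right-hand side first, then store the result.
        env[node[1]] = evaluate(node[2], env)
        return env[node[1]]
    if kind == "neg":
        return -evaluate(node[1], env)
    left = evaluate(node[1], env)
    right = evaluate(node[2], env)
    return BINARY[kind](left, right)
```

Evaluating ("assign", "b", ("+", ("ident", "a"), ("int", 1))) with an empty env raises NameError with the message "unknown identifier a", although the tree is syntactically fine.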

More about lexer, parser and code generator

If you look at the bottom of this page, you will find the link document source. That is the source code from which this website was generated using doctool. Like ulmcalc, ulmas etc., doctool has a lexer, parser and code generator.

Also, whenever you type a command in the shell like

theon$ ls -l
total 113
drwxrwsr-x   2 lehn     num           12 Jul  3 19:45 db
-rw-rw-r--   1 lehn     num          187 Apr  8 08:33 doctool.postconfig
-rw-rw-r--   1 lehn     num          356 Apr  8 08:28 doctool.preconfig
-rw-rw-r--   1 lehn     num         4948 May 12 07:46 index.doc
-rw-r--r--   1 lehn     num         2556 Apr  8 08:29 IndexHeader
-rwxrwxr-x   1 lehn     num          325 Jun  5 10:14 readdb.pm
drwxrwsr-x   3 lehn     num            8 Apr 15 11:25 session00
drwxrwsr-x   2 lehn     num            6 Apr 20 18:48 session01
drwxr-sr-x   2 lehn     num            6 Apr 21 18:20 session02
drwxr-sr-x   2 lehn     num            6 Jun  5 20:45 session03
drwxr-sr-x   2 lehn     num            6 Jun  5 20:49 session04
drwxr-sr-x   2 lehn     num            5 Jun  5 20:51 session05
drwxr-sr-x   2 lehn     num            8 Jun  5 20:53 session06
drwxrwsr-x   3 lehn     num            7 Jun  7 10:06 session07
drwxr-sr-x  16 lehn     num           23 Jun 22 08:30 session08
drwxr-sr-x   5 lehn     num            9 Jul  1 10:54 session09
drwxr-sr-x  12 lehn     num           25 Jun 26 09:17 session10
drwxr-sr-x  15 lehn     num           20 Jul  3 19:40 session11
-rw-r--r--   1 lehn     num         3736 Jun  4 13:02 SimpleHeader
-rw-rw-r--   1 lehn     num         4276 May 29 18:07 syllabus.doc
drwxrwsr-x   2 lehn     num           28 Jul  2 21:16 tex
drwxrwsr-x  14 lehn     num           16 Jul  3 19:45 tmp
-rwxrwxr-x   1 lehn     num          289 Jun  5 10:15 writedb.pm
theon$ 

the same components are in play. The tokens are ls and -l. The syntax tree has a root node for ls with a child node -l. The code generator uses the root node as the command and the child nodes as arguments to this command. So you can say that the generated code is the execution of the syntax tree.

So what kind of error do you get from

theon$ ls-l
bash: ls-l: command not found
theon$ 

or

theon$ cd..
bash: cd..: command not found
theon$ 

Both ls-l and cd.. generate just one token each. The syntax tree contains just one node, ls-l or cd.. respectively. Running this gives an error, so this is a semantic error.
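
The shell analogy can be tried out with shlex.split from Python's standard library, which splits a line into words roughly the way a POSIX shell does. The helper command_tree and its dictionary layout are hypothetical names for this illustration; the way bash really processes a command line is far more elaborate.

```python
import shlex


def command_tree(line):
    """Split a command line into words and arrange them as a tiny tree."""
    words = shlex.split(line)
    if not words:
        return None
    # The first word is the root node (the command),
    # all remaining words are its child nodes (the arguments).
    return {"command": words[0], "arguments": words[1:]}
```

With this helper, "ls -l" gives the command ls with the single argument -l, whereas "ls-l" and "cd.." each give one command word without arguments. The error appears only when that word is looked up as a command.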