Let's start with some simple expressions

In a first step we talk about some simple calculator that can add and subtract integers. So it should be able to compute such an expression:

The calculator will do this in three steps. It first breaks the string that is supposed to be an expression into tokens. From this tokens it generates (if it conforms to some specified grammar) a syntax tree. The result then gets evaluated by traversing the tree.

What kind of expressions the calculator can handle depends on different things:

  • What numeral representations do we allow? Decimal, hexadecimal, octal?

  • What kind of operations do we allow?

In this first and simple version we only allow numerals that are decimal and (as mentioned above) only support '+' and '-' operations. Now two things are of interest here:

  • How are these features described in some formal way (and implemented) and

  • what kind of errors can occur if someone uses this calculator.

At the end we also want to talk about how to improve this calculator in a next step.

Using the calculator

You can use the calculator interactive (like shown in the video) or you can write the expression into a file. On this page you will see examples for a file example1 with the following content:

theon$ cat example1
14 +5- 2
theon$ 

When using a file make sure the expression is terminated by a newline.

Video tutorial

Source files shown in this video are

Lexical elements

The input is converted into a sequence of tokens during the lexical analysis. Tokens can be operators or decimal constants. Tokens end if the next character is a space or if the next character cannot be added to it:

  • Decimal constants are represented either by '0' alone or begin with a decimal digit '1' to '9' optionally followed by more digits from '0' to '9'.

  • The characters '+' and '-' are operators.

Tokens end if the next character is a space or if the next character cannot be added to it.

Newline is treated as “end of input”.

The above example will generate the following tokens:

theon$ ulmcalc-step1-lexer example1
DECIMAL_CONSTANT "14" at example1:1.1-2
PLUS at example1:1.3-4
DECIMAL_CONSTANT "5" at example1:1.5
MINUS at example1:1.6
DECIMAL_CONSTANT "2" at example1:1.7-8
theon$ 

Grammar

The grammar for the expressions that we allow can be described by the following production rules:

\[\begin{array}{lcl}\langle\text{expression}\rangle & \to & \langle\text{exp}\rangle \\ \langle\text{exp}\rangle & \to & \langle\text{term}\rangle \\ & \to & \langle\text{exp}\rangle \quad \text{'+'} \quad \langle\text{term}\rangle \\ & \to & \langle\text{exp}\rangle \quad \text{'-'} \quad \langle\text{term}\rangle \\ \langle\text{term}\rangle & \to & \langle\text{integer}\rangle \\ \langle\text{integer}\rangle & \to & \text{decimal-constant} \\ \end{array}\]

Here, the (terminal) symbols '+', '-' and decimal-constant refer to tokens.

Now we can run our example from above first through the lexer and then through the parser. Using ulmcalc-step1-tree.ast you can do that as follows and will see some textual representation of the tree:

theon$ ulmcalc-step1-tree.ast example1
("expression"
   ("-"
      ("+"
         ("integer"
            14
         )
         ("integer"
            5
         )
      )
      ("integer"
         2
      )
   )
)
theon$ ulmcalc-step1-jstree.ast -o tree-example1.html tree-example1 example1
theon$ 

By using ulmcalc-step1-jstree.ast like this

theon$ ulmcalc-step1-jstree.ast -o tree-example1.html tree-example1 example1
theon$ 

the following SVG was generated:

Generate the result of the expression

Having such a tree representation it is easy to evaluate the expression by traversing the tree in post oder. If a node is a leaf node (i.e. not having any children) then it is decimal constant and that specifies its value. If it is not a leaf node (i.e. it has some children) then depending on its type (plus or minus) compute its value by either taking the sum or the difference of its child nodes.

And that it done by ulmcalc-step1.ast:

theon$ ulmcalc-step1.ast example1
17
theon$ 

So what are the limitations that we want to overcome next?

Certainly we also want to do multiplications! And allowing octal and hexadecimal numerals would be nice too!