=============
Some I/O Hack								[TOC]
=============

---- SHELL (path=session23/git, hide) ------------------------------------------
rm -rf ulm-generator
git clone git@github.com:michael-lehn/ulm-generator.git
rm -rf ulm-generator/0_ulm_variants/my_isa
rm -rf ulm-generator/0_ulm_variants/ulm
mkdir ulm-generator/0_ulm_variants/stack
touch ulm-generator/0_ulm_variants/stack/variant.mk
cp ../isa.txt ulm-generator/0_ulm_variants/stack/
--------------------------------------------------------------------------------

---- SHELL (path=session23/git/ulm-generator, hide) -----------------------------
make
--------------------------------------------------------------------------------

---- SHELL (path=session23/git, hide) ------------------------------------------
rm -rf abc
git clone git@gitlab.com:uni-ulmulm-university-department-of-numerical-analysis/abc.git
git config --global advice.detachedHead false
--------------------------------------------------------------------------------

---- SHELL (path=session23/git/abc, hide) --------------------------------------
git checkout iohack-constfold
--------------------------------------------------------------------------------


The right way to support I/O functionality would require function calls. In the
meantime we provide some limited I/O support through two operators. The input
operator '``$>``' reads an unsigned integer into an l-value expression, and the
output operator '``$<``' prints the value of an assignment expression.

In the grammar we integrate this hack in such a way that it can easily be
removed once we have proper function calls:

---- LATEX ---------------------------------------------------------------------
\begin{array}{rcl}
\text{input-sequence}
    & = &
    \{\;
    $> \;
    \text{assignment-expr}\;
    \texttt{;}\;
    |\;
    $< \;
    \text{assignment-expr}\;
    \texttt{;}\;
    |\;
    \text{expr-statement}\;
    \}\;
    \\
\text{expr-statement}
    & = &
    \text{assignment-expr}\;
    \texttt{;}\;
    \\
\text{assignment-expr}
    & = &
    \text{expr}\;
    [\;
    \texttt{=}\;
    \text{assignment-expr}\;
    ]\;
    \\
\text{expr}
    & = &
    \text{term}\;
    \{\;
      (\;
      \texttt{"+"}\; |\; \texttt{"-"}\;
      )\;
      \text{term}
    \}
    \\
\text{term}
    & = &
    \text{unary-expr}\;
    \{\;
      (\;
      \texttt{"*"}\; |\; \texttt{"/"}\; |\;  \texttt{"%"}\;
      )\;
      \text{unary-expr}
    \}
    \\
\text{unary-expr}
    & = &
    \text{factor}\;
    |\;
    (\; \texttt{"+"}\; |\; \texttt{"-"}\;)\;
    \text{unary-expr}
    \\
\text{factor}
    & = &
    \text{identifier}\;
    \\
    & | &
    \text{dec-literal}\;
    \\
    & | &
    \text{hex-literal}\;
    \\
    & | &
    \text{oct-literal}\;
    \\
    & | &
    \texttt{"("}\;
    \text{assignment-expr}\;
    \texttt{")"}\;
    \\
\end{array}
--------------------------------------------------------------------------------

Assembly Functions for I/O
==========================
Below you find assembly code for the function ``get_uint64()``, which reads in
and returns an unsigned value:

:import: session23/git/abc/examples/getuint64.s [fold]

And for the function ``print_uint64()``, which prints an unsigned integer
passed as parameter:

:import: session23/git/abc/examples/printuint64.s [fold]

Of course this code needs to be "available" for the assembly code that our
compiler produces. We have several choices:

- The complete source code of both functions could be generated by
  ``genFooter()``.
- We can generate object files ``getuint64.o`` and ``printuint64.o``. The
  object files generated from our assembler output can be linked against these.
- The ULM assembler can process a sequence of assembly files and treats this
  input sequence as if it were one large source file.
  
For now the last option listed above is the simplest. If the compiler generated
``foo.s`` then we invoke the assembler with

---- CODE (type=sh) ------------------------------------------------------------
path-to-ulm/ulmas foo.s getuint64.s printuint64.s
--------------------------------------------------------------------------------

In a subsequent section you will find a makefile that can be used in a separate
directory for testing our compiler. In the makefile we can hide such nasty
details.
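
For reference, the behaviour of the two assembly functions can be modelled in
C. The following is only a sketch under assumptions: the names
_parseUint64()_ and _formatUint64()_ are hypothetical, the model works on
strings instead of the actual input and output, and it assumes plain decimal
notation. The imported assembly code may differ in details such as the handling
of white space or overflow.

---- CODE (type=c) -------------------------------------------------------------
#include <stddef.h>
#include <stdint.h>

// Hypothetical model of get_uint64(): accumulate the leading decimal digits
// of s. Stops at the first non-digit; overflow wraps around modulo 2^64.
uint64_t
parseUint64(const char *s)
{
    uint64_t val = 0;
    while (*s >= '0' && *s <= '9') {
        val = val * 10 + (uint64_t)(*s - '0');
        ++s;
    }
    return val;
}

// Hypothetical model of print_uint64(): write the decimal digits of val to
// buf (at least 21 bytes) and return the number of digits written.
size_t
formatUint64(uint64_t val, char *buf)
{
    char tmp[20]; // 2^64 - 1 has 20 decimal digits
    size_t n = 0;

    // digits are produced in reverse order, least significant first
    do {
        tmp[n++] = (char)('0' + val % 10);
        val /= 10;
    } while (val != 0);

    for (size_t i = 0; i < n; ++i) {
        buf[i] = tmp[n - 1 - i];
    }
    buf[n] = '\0';
    return n;
}
--------------------------------------------------------------------------------

The do-while loop guarantees that the value zero is printed as a single digit
instead of an empty string.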


Changes for the Lexer
=====================
We need at least three more token kinds for '`$`', '`<`' and '`>`'. But now is
a good opportunity to add a few more:

    +---------------------------+-----------------------+
    | end of input		|   `EOI`		|
    +---------------------------+-----------------------+
    | bad token			|   `BAD_TOKEN`		|
    +---------------------------+-----------------------+
    | `[1-9][0-9]*`		|   `DEC_LITERAL`	|
    +---------------------------+-----------------------+
    | `0x[0-9a-fA-F]+`		|   `HEX_LITERAL`	|
    +---------------------------+-----------------------+
    | `0[0-7]*`			|   `OCT_LITERAL`	|
    +---------------------------+-----------------------+
    | `&`			|   `AMPERSAND`		|
    +---------------------------+-----------------------+
    | `&&`   			|   `AMPERSAND2`	|
    +---------------------------+-----------------------+
    | `*`			|   `ASTERISK`		|
    +---------------------------+-----------------------+
    | `^`			|   `CARET`		|
    +---------------------------+-----------------------+
    | `$`			|   `DOLLAR`		|
    +---------------------------+-----------------------+
    | `=` 			|   `EQUAL`		|
    +---------------------------+-----------------------+
    | `==`			|   `EQUAL2`		|
    +---------------------------+-----------------------+
    | `!`			|   `NOT`		|
    +---------------------------+-----------------------+
    | `!=`			|   `NOT_EQUAL`		|
    +---------------------------+-----------------------+
    | `>`			|   `GREATER`		|
    +---------------------------+-----------------------+
    | `>=`			|   `GREATER_EQUAL`	|
    +---------------------------+-----------------------+
    | `<`			|   `LESS`		|
    +---------------------------+-----------------------+
    | `<=`			|   `LESS_EQUAL`	|
    +---------------------------+-----------------------+
    | `(`			|   `LPAREN`		|
    +---------------------------+-----------------------+
    | `-`			|   `MINUS`		|
    +---------------------------+-----------------------+
    | `%`			|   `PERCENT`		|
    +---------------------------+-----------------------+
    | `+`			|   `PLUS`		|
    +---------------------------+-----------------------+
    | `)`			|   `RPAREN`		|
    +---------------------------+-----------------------+
    | `;`			|   `SEMICOLON`		|
    +---------------------------+-----------------------+
    | `/`			|   `SLASH`		|
    +---------------------------+-----------------------+
    | `~`			|   `TILDE`		|
    +---------------------------+-----------------------+
    | `|`			|   `VBAR`		|
    +---------------------------+-----------------------+
    | `||`			|   `VBAR2`		|
    +---------------------------+-----------------------+
    | `[a-zA-Z_][a-zA-Z_0-9]*`	|   `IDENTIFIER`	|
    +---------------------------+-----------------------+
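
Several of these token kinds share their first character, for example '`>`'
and '`>=`'. One common way to separate them is a single character of
lookahead. The following sketch shows the idea for a subset of the table. It is
not the code from _lexer.c_: the enum and the function _scanPunctuator()_ are
hypothetical, and the sketch scans a string instead of an input stream.

---- CODE (type=c) -------------------------------------------------------------
#include <stddef.h>

// Subset of the token kinds from the table (hypothetical enum for this
// sketch; it is not the one used by the actual lexer).
enum TokenKind
{
    BAD_TOKEN,
    DOLLAR,
    EQUAL,
    EQUAL2,
    NOT,
    NOT_EQUAL,
    GREATER,
    GREATER_EQUAL,
    LESS,
    LESS_EQUAL,
};

// Scan one punctuator at the start of s. The number of consumed characters
// is stored in *len.
enum TokenKind
scanPunctuator(const char *s, size_t *len)
{
    // one character of lookahead: does '=' follow the first character?
    int eq = s[0] != '\0' && s[1] == '=';
    enum TokenKind kind;

    switch (s[0]) {
        case '$': kind = DOLLAR; eq = 0; break; // '$' never combines with '='
        case '=': kind = eq ? EQUAL2 : EQUAL; break;
        case '!': kind = eq ? NOT_EQUAL : NOT; break;
        case '>': kind = eq ? GREATER_EQUAL : GREATER; break;
        case '<': kind = eq ? LESS_EQUAL : LESS; break;
        default: kind = BAD_TOKEN; eq = 0; break;
    }
    *len = eq ? 2 : 1;
    return kind;
}
--------------------------------------------------------------------------------

With this scheme the input operator '``$>``' is not a token of its own. It is
scanned as `DOLLAR` followed by `GREATER`, and the output operator as `DOLLAR`
followed by `LESS`.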

Here are the changes applied to _tokenkind.txt_ and _lexer.c_:

:import: session23/git/abc/tokenkind.txt [fold]
:import: session23/git/abc/lexer.c [fold]

The test for the lexer was changed so that a bad token triggers an assertion
failure. Here is a minimal test run for the lexer:

---- SHELL (path=session23/git/abc,fold) ---------------------------------------
make xtest_lexer
cat test_lexer.in | ./xtest_lexer
--------------------------------------------------------------------------------

The test input for the lexer was changed to

:import: session23/git/abc/test_lexer.in [fold]


Changes in the Code Generation Interface
========================================
The code generation interface now declares four more functions:

---- CODE (type=c) -------------------------------------------------------------
#ifndef ABC_GEN_H
#define ABC_GEN_H

// ...

// fetch / store quad word (8 bytes)
void genFetch(GenReg addr, GenReg dest);
void genFetchDispl(int64_t displ, GenReg addr, GenReg dest); // new
void genStore(GenReg src, GenReg addr);
void genStoreDispl(GenReg src, int64_t displ, GenReg addr); // new

// ...

// IO hack
void genOutHack(GenReg src); // new
void genInHack(GenReg dest); // new

#endif // ABC_GEN_H
--------------------------------------------------------------------------------

Functions _genFetchDispl()_ and _genStoreDispl()_ can be used for fetch and
store instructions where a displacement is needed. For example,
_genFetchDispl(16, 8, 9)_ generates

---- CODE (type=s) -------------------------------------------------------------
    movq 16(%8), %9
--------------------------------------------------------------------------------

If the displacement cannot be encoded, code gets generated that first computes
the displaced address and then fetches the data from there. For example,
_genFetchDispl(256, 8, 9)_ might acquire %6 as temporary register and generate

---- CODE (type=s) -------------------------------------------------------------
	ldzwq 0x100, %6
	addq %6, %8, %6
	movq (%6), %9
--------------------------------------------------------------------------------
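
The decision between the two forms can be sketched as follows. This is not the
code from _gen.c_. The function _sketchFetchDispl()_ is hypothetical, it writes
into a string buffer instead of the output file, and the temporary register is
passed as a parameter instead of being acquired. It also assumes that the
displacement of ``movq`` is a signed 8-bit value and that ``ldzwq`` accepts a
16-bit immediate value. Both assumptions depend on the instruction set in
_isa.txt_.

---- CODE (type=c) -------------------------------------------------------------
#include <stdint.h>
#include <stdio.h>

// Write the code for fetching a quad word from displ(%addr) into %dest to
// buf. Register tmp is only used if displ can not be encoded directly.
// Returns the number of generated instructions, or 0 if this sketch does
// not handle the displacement.
int
sketchFetchDispl(char *buf, size_t size, int64_t displ, unsigned addr,
                 unsigned dest, unsigned tmp)
{
    if (displ >= -128 && displ <= 127) {
        // assumption: displacement is encoded as signed 8-bit value
        snprintf(buf, size, "\tmovq %lld(%%%u), %%%u\n", (long long)displ,
                 addr, dest);
        return 1;
    }
    if (displ >= 0 && displ <= 0xFFFF) {
        // assumption: ldzwq loads a zero extended 16-bit immediate value
        snprintf(buf, size,
                 "\tldzwq 0x%llX, %%%u\n"
                 "\taddq %%%u, %%%u, %%%u\n"
                 "\tmovq (%%%u), %%%u\n",
                 (unsigned long long)displ, tmp, tmp, addr, tmp, tmp, dest);
        return 3;
    }
    // larger or more negative displacements would need more instructions
    return 0;
}
--------------------------------------------------------------------------------

Under these assumptions a call with the arguments 16, 8, 9 produces the single
``movq`` instruction, and a call with 256, 8, 9 and temporary register 6
produces the three instructions shown above.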

Function _genOutHack()_ generates code to call a function for printing the
unsigned integer in the register specified by _src_.

Function _genInHack()_ generates code to call a function for reading in an
unsigned integer and for storing it in the register specified by _dest_.

Here are the complete header and source file:

:import: session23/git/abc/gen.h [fold]
:import: session23/git/abc/gen.c [fold]


Changes in the Parser
=====================
In the parser only function _parse()_ needs to be changed:

---- CODE (type=c) -------------------------------------------------------------
void
parse(void)
{
    while (token.kind != EOI) {
	if (token.kind == DOLLAR) {
	    getToken();
	    if (token.kind == GREATER) {
		getToken();
		// read unsigned integer
		GenReg dest = genGetReg(), val = genGetReg();
		struct TokenPos pos = token.pos;
		const struct Expr *expr = parseExpr();
		if (!isLValueExpr(expr)) {
		    errorAtPos(pos, "L-value expected");
		}
		genInHack(val);
		loadExprAddr(expr, dest);
		genStore(val, dest);
		genUngetReg(dest);
		genUngetReg(val);
	    } else if (token.kind == LESS) {
		getToken();
		// print unsigned integer
		GenReg src = genGetReg();
		const struct Expr *expr = parseAssignmentExpr();
		loadExpr(expr, src);
		genOutHack(src);
		genUngetReg(src);
	    } else {
		expectedError("'>' or '<'");
	    }
	    expected(SEMICOLON);
	    getToken();
	} else {
	    parseExprStatement();
	}
    }
}

--------------------------------------------------------------------------------

Testing the Compiler
====================

Run ``make xtest_abc`` to build the compiler:

---- SHELL (path=session23/git/abc, fold) --------------------------------------
make xtest_abc
--------------------------------------------------------------------------------

Now a simple program like

---- CODE (file=session23/git/abc/print_42.abc) --------------------------------
x = 21;
x = x * 2;
$< x;
--------------------------------------------------------------------------------

can be used to print out the value of a variable. With the compiler we first
generate the assembly code. Currently the compiler reads the source code from
_stdin_ so we have to redirect the source file with '``<``':

---- SHELL (path=session23/git/abc, fold) --------------------------------------
./xtest_abc print_42.s < print_42.abc
--------------------------------------------------------------------------------

Here is the generated assembly code:

:import: session23/git/abc/print_42.s [fold]

Next we can use the ULM assembler to generate an executable. In addition to the
generated assembly code ``print_42.s`` we also pass the source files
``getuint64.s`` and  ``printuint64.s``:

---- SHELL (path=session23/git/abc, hide) --------------------------------------
cp examples/*uint*.s .
--------------------------------------------------------------------------------

---- SHELL (path=session23/git/abc, fold) --------------------------------------
../ulm-generator/1_ulm_build/stack/ulmas print_42.s getuint64.s printuint64.s
--------------------------------------------------------------------------------

This produces _a.out_:

:import: session23/git/abc/a.out [fold]

which can now be executed:

---- SHELL (path=session23/git/abc, fold) --------------------------------------
../ulm-generator/1_ulm_build/stack/ulm a.out
--------------------------------------------------------------------------------

If you find it inconvenient to specify the path to ULM for executing the
program you can turn it into an executable script. The first line of such a
script contains after the __Shebang__ the path to a program, which the system
then runs with the path of the script as argument. So in the case above the
first line should be

---- CODE (type=sh) ------------------------------------------------------------
#! ../ulm-generator/1_ulm_build/stack/ulm
--------------------------------------------------------------------------------

With the two Unix commands

---- SHELL (path=session23/git/abc, fold) --------------------------------------
(echo '#! ../ulm-generator/1_ulm_build/stack/ulm'; cat a.out) > print_42
chmod +x print_42
--------------------------------------------------------------------------------

a new file ``print_42`` was created with the Shebang for the ULM in the first
line followed by the content of _a.out_.

:import: session23/git/abc/print_42 [fold]

Because _chmod_ made this file executable, it can be run directly:

---- SHELL (path=session23/git/abc, fold) --------------------------------------
./print_42
--------------------------------------------------------------------------------

Of course the creation of an executable can be automated with a makefile.

:links: Shebang -> https://en.wikipedia.org/wiki/Shebang_(Unix)

Example Directory and Makefile
==============================
Create a subdirectory and save the following files there (make sure the
makefile uses proper tabs):

:import: session23/git/abc/examples/Makefile [fold]
:import: session23/git/abc/examples/getuint64.s [fold]
:import: session23/git/abc/examples/printuint64.s [fold]
:import: session23/git/abc/examples/test.abc [fold]

Create an additional file "path-to-ulm" that contains the path to the directory
with your ULM assembler and the virtual machine. For example

---- CODE (file=session23/git/abc/examples/path-to-ulm) ------------------------
$HOME/tmp/ulm-generator/1_ulm_build/stack/
--------------------------------------------------------------------------------

or

---- CODE (file=session23/git/abc/examples/path-to-ulm) ------------------------
../../ulm-generator/1_ulm_build/stack/
--------------------------------------------------------------------------------

Then with ``make`` all files with the extension '``.abc``' will be translated
into executables:

---- SHELL (path=session23/git/abc/examples) -----------------------------------
make
echo 4 | ./test
--------------------------------------------------------------------------------