=============
Some I/O Hack [TOC]
=============

---- SHELL (path=session23/git, hide) ------------------------------------------
rm -rf ulm-generator
git clone git@github.com:michael-lehn/ulm-generator.git
rm -rf ulm-generator/0_ulm_variants/my_isa
rm -rf ulm-generator/0_ulm_variants/ulm
mkdir ulm-generator/0_ulm_variants/stack
touch ulm-generator/0_ulm_variants/stack/variant.mk
cp ../isa.txt ulm-generator/0_ulm_variants/stack/
--------------------------------------------------------------------------------

---- SHELL (path=session23/git/ulm-generator, hide) -----------------------------
make
--------------------------------------------------------------------------------

---- SHELL (path=session23/git, hide) ------------------------------------------
rm -rf abc
git clone git@gitlab.com:uni-ulmulm-university-department-of-numerical-analysis/abc.git
git config --global advice.detachedHead false
--------------------------------------------------------------------------------

---- SHELL (path=session23/git/abc, hide) --------------------------------------
git checkout iohack-constfold
--------------------------------------------------------------------------------

Proper support for I/O functionality would require function calls. In the
meantime we provide some limited I/O support through two operators. The input
operator '``$>``' can be used to read an unsigned integer into an l-value
expression. The output operator '``$<``' can be used to print the value of an
assignment expression.
In the grammar we integrate this hack so that it can easily be removed once we
have proper function calls:

---- LATEX ---------------------------------------------------------------------
\begin{array}{rcl}
\text{input-sequence} & = & \{\; $> \; \text{assignment-expr}\; \texttt{;}\; |\; $< \; \text{assignment-expr}\; \texttt{;}\; |\; \text{expr-statement}\; \}\; \\
\text{expr-statement} & = & \text{assignment-expr}\; \texttt{;}\; \\
\text{assignment-expr} & = & \text{expr}\; [\; \texttt{=}\; \text{assignment-expr}\; ]\; \\
\text{expr} & = & \text{term}\; \{\; (\; \texttt{"+"}\; |\; \texttt{"-"}\; )\; \text{term} \} \\
\text{term} & = & \text{unary-expr}\; \{\; (\; \texttt{"*"}\; |\; \texttt{"/"}\; |\; \texttt{"%"}\; )\; \text{unary-expr} \} \\
\text{unary-expr} & = & \text{factor}\; |\; (\; \texttt{"+"}\; |\; \texttt{"-"}\;)\; \text{unary-expr} \\
\text{factor} & = & \text{identifier}\; \\
 & | & \text{dec-literal}\; \\
 & | & \text{hex-literal}\; \\
 & | & \text{oct-literal}\; \\
 & | & \texttt{"("}\; \text{assignment-expr}\; \texttt{")"}\; \\
\end{array}
--------------------------------------------------------------------------------

Assembly Functions for I/O
==========================

Below you find the assembly code for the function ``get_uint64()``, which reads
in and returns an unsigned value:

:import: session23/git/abc/examples/getuint64.s [fold]

And for the function ``print_uint64()``, which prints an unsigned integer passed
as parameter:

:import: session23/git/abc/examples/printuint64.s [fold]

Of course this code needs to be "available" to the assembly code that our
compiler produces. We have several choices:

- The complete source code of both functions could be generated by
  ``genFooter()``.
- We can generate object files ``getuint64.o`` and ``printuint64.o``. The
  object files generated from our assembler output can be linked against these.
- The ULM assembler can process a sequence of assembly files and treats this
  input sequence as if it was one large source file.
For now the last option listed above is the simplest. If the compiler generated
``foo.s`` then we invoke the assembler with

---- CODE (type=s) -------------------------------------------------------------
path-to-ulm/ulmas foo.s getuint64.s printuint64.s
--------------------------------------------------------------------------------

In a subsequent section you will find a makefile that can be used in a separate
directory for testing our compiler. In the makefile we can hide such nasty
details.

Changes for the Lexer
=====================

We need at least three more token kinds for '`$`', '`<`' and '`>`'. But now is
a good opportunity to add even a few more:

+---------------------------+-----------------------+
| end of input              | `EOI`                 |
+---------------------------+-----------------------+
| bad token                 | `BAD_TOKEN`           |
+---------------------------+-----------------------+
| `[1-9][0-9]*`             | `DEC_LITERAL`         |
+---------------------------+-----------------------+
| `0x[0-9a-fA-F]+`          | `HEX_LITERAL`         |
+---------------------------+-----------------------+
| `0[0-7]*`                 | `OCT_LITERAL`         |
+---------------------------+-----------------------+
| `&`                       | `AMPERSAND`           |
+---------------------------+-----------------------+
| `&&`                      | `AMPERSAND2`          |
+---------------------------+-----------------------+
| `*`                       | `ASTERISK`            |
+---------------------------+-----------------------+
| `^`                       | `CARET`               |
+---------------------------+-----------------------+
| `$`                       | `DOLLAR`              |
+---------------------------+-----------------------+
| `=`                       | `EQUAL`               |
+---------------------------+-----------------------+
| `==`                      | `EQUAL2`              |
+---------------------------+-----------------------+
| `!`                       | `NOT`                 |
+---------------------------+-----------------------+
| `!=`                      | `NOT_EQUAL`           |
+---------------------------+-----------------------+
| `>`                       | `GREATER`             |
+---------------------------+-----------------------+
| `>=`                      | `GREATER_EQUAL`       |
+---------------------------+-----------------------+
| `<`                       | `LESS`                |
+---------------------------+-----------------------+
| `<=`                      | `LESS_EQUAL`          |
+---------------------------+-----------------------+
| `(`                       | `LPAREN`              |
+---------------------------+-----------------------+
| `-`                       | `MINUS`               |
+---------------------------+-----------------------+
| `%`                       | `PERCENT`             |
+---------------------------+-----------------------+
| `+`                       | `PLUS`                |
+---------------------------+-----------------------+
| `)`                       | `RPAREN`              |
+---------------------------+-----------------------+
| `;`                       | `SEMICOLON`           |
+---------------------------+-----------------------+
| `/`                       | `SLASH`               |
+---------------------------+-----------------------+
| `~`                       | `TILDE`               |
+---------------------------+-----------------------+
| `|`                       | `VBAR`                |
+---------------------------+-----------------------+
| `||`                      | `VBAR2`               |
+---------------------------+-----------------------+
| `[a-zA-Z_][a-zA-Z_0-9]*`  | `IDENTIFIER`          |
+---------------------------+-----------------------+

Here are the changes applied to _tokenkind.txt_ and _lexer.c_:

:import: session23/git/abc/tokenkind.txt [fold]

:import: session23/git/abc/lexer.c [fold]

The test for the lexer was changed so that a bad token triggers an assertion
failure. Here is a minimal test run for the lexer:

---- SHELL (path=session23/git/abc,fold) ---------------------------------------
make xtest_lexer
cat test_lexer.in | ./xtest_lexer
--------------------------------------------------------------------------------

The test input for the lexer was changed to

:import: session23/git/abc/test_lexer.in [fold]

Changes in the Code Generation Interface
========================================

The code generation interface now declares four more functions:

---- CODE (type=c) -------------------------------------------------------------
#ifndef ABC_GEN_H
#define ABC_GEN_H

// ...

// fetch / store quad word (8 bytes)
void genFetch(GenReg addr, GenReg dest);
void genFetchDispl(int64_t displ, GenReg addr, GenReg dest); // new
void genStore(GenReg src, GenReg addr);
void genStoreDispl(GenReg src, int64_t displ, GenReg addr); // new

// ...

// IO hack
void genOutHack(GenReg src); // new
void genInHack(GenReg dest); // new

#endif // ABC_GEN_H
--------------------------------------------------------------------------------

Functions _genFetchDispl()_ and _genStoreDispl()_ can be used for fetch and
store instructions where a displacement is needed. For example,
_genFetchDispl(16, 8, 9)_ generates

---- CODE (type=s) -------------------------------------------------------------
movq 16(%8), %9
--------------------------------------------------------------------------------

If the displacement cannot be encoded in the instruction, code gets generated
that first computes the displaced address and then fetches the data from there.
For example, _genFetchDispl(256, 8, 9)_ might acquire %6 as temporary register
and generate

---- CODE (type=s) -------------------------------------------------------------
ldzwq 0x100, %6
addq %6, %8, %6
movq (%6), %9
--------------------------------------------------------------------------------

Function _genOutHack()_ generates code to call a function for printing the
unsigned integer in the register specified by _src_. Function _genInHack()_
generates code to call a function for reading in an unsigned integer and for
storing it in the register specified by _dest_.
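
The two cases of _genFetchDispl()_ described above can be sketched as follows.
This is only an illustration and not the code in _gen.c_: the function names
_fitsDispl()_ and _sketchFetchDispl()_ are hypothetical, the temporary register
is passed in instead of being acquired, and the sketch assumes that the
displacement field is a signed 8-bit value (the text only tells us that 16 can
be encoded and 256 cannot). Instead of emitting code, it writes the assembly
text into a buffer so the result can be inspected.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

// Assumption: the displacement is encoded as a signed 8-bit field.
static int
fitsDispl(int64_t displ)
{
    return displ >= -128 && displ <= 127;
}

// Writes assembly text for fetching the quad word at displ(%addr) into %dest.
// tmp is a hypothetical scratch register that was acquired by the caller.
// Returns the number of characters written or -1 if the sketch does not
// cover the displacement (negative or larger than 16 bits in the fallback).
static int
sketchFetchDispl(int64_t displ, int addr, int dest, int tmp,
                 char *buf, size_t size)
{
    if (fitsDispl(displ)) {
        // displacement fits: a single fetch instruction is sufficient
        return snprintf(buf, size, "movq %" PRId64 "(%%%d), %%%d\n",
                        displ, addr, dest);
    }
    if (displ < 0 || displ > 0xFFFF) {
        // would require loading a wider or sign extended constant
        return -1;
    }
    // load displacement, add base address, fetch from the computed address
    return snprintf(buf, size,
                    "ldzwq 0x%" PRIx64 ", %%%d\n"
                    "addq %%%d, %%%d, %%%d\n"
                    "movq (%%%d), %%%d\n",
                    (uint64_t)displ, tmp,
                    tmp, addr, tmp,
                    tmp, dest);
}
```

With these assumptions, _sketchFetchDispl(16, 8, 9, 6, buf, sizeof buf)_ writes
the single ``movq`` instruction and _sketchFetchDispl(256, 8, 9, 6, buf, sizeof
buf)_ writes the three-instruction sequence shown above.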
Here are the complete header and source file:

:import: session23/git/abc/gen.h [fold]

:import: session23/git/abc/gen.c [fold]

Changes in the Parser
=====================

In the parser only function _parse()_ needs to be changed:

---- CODE (type=c) -------------------------------------------------------------
void
parse(void)
{
    while (token.kind != EOI) {
        if (token.kind == DOLLAR) {
            getToken();
            if (token.kind == GREATER) {
                getToken();
                // read unsigned integer
                GenReg dest = genGetReg(), val = genGetReg();
                struct TokenPos pos = token.pos;
                const struct Expr *expr = parseExpr();
                if (!isLValueExpr(expr)) {
                    errorAtPos(pos, "L-value expected");
                }
                genInHack(val);
                loadExprAddr(expr, dest);
                genStore(val, dest);
                genUngetReg(dest);
                genUngetReg(val);
            } else if (token.kind == LESS) {
                getToken();
                // print unsigned integer
                GenReg src = genGetReg();
                const struct Expr *expr = parseAssignmentExpr();
                loadExpr(expr, src);
                genOutHack(src);
                genUngetReg(src);
            } else {
                expectedError("'>' or '<'");
            }
            expected(SEMICOLON);
            getToken();
        } else {
            parseExprStatement();
        }
    }
}
--------------------------------------------------------------------------------

Testing the Compiler
====================

Run ``make xtest_abc`` to build the compiler:

---- SHELL (path=session23/git/abc, fold) --------------------------------------
make xtest_abc
--------------------------------------------------------------------------------

Now a simple program like

---- CODE (file=session23/git/abc/print_42.abc) --------------------------------
x = 21;
x = x * 2;
$< x;
--------------------------------------------------------------------------------

can be used to print out the value of a variable. With the compiler we first
generate the assembly code.
Currently the compiler reads the source code from _stdin_, so we have to
redirect the source file with '``<``':

---- SHELL (path=session23/git/abc, fold) --------------------------------------
./xtest_abc print_42.s < print_42.abc
--------------------------------------------------------------------------------

Here is the generated assembly code:

:import: session23/git/abc/print_42.s [fold]

Next we can use the ULM assembler to generate an executable. In addition to the
generated assembly code ``print_42.s`` we also pass the source files
``getuint64.s`` and ``printuint64.s``:

---- SHELL (path=session23/git/abc, hide) --------------------------------------
cp examples/*uint*.s .
--------------------------------------------------------------------------------

---- SHELL (path=session23/git/abc, fold) --------------------------------------
../ulm-generator/1_ulm_build/stack/ulmas print_42.s getuint64.s printuint64.s
--------------------------------------------------------------------------------

This produces _a.out_:

:import: session23/git/abc/a.out [fold]

It can now be executed:

---- SHELL (path=session23/git/abc, fold) --------------------------------------
../ulm-generator/1_ulm_build/stack/ulm a.out
--------------------------------------------------------------------------------

If you find it inconvenient to specify the path to the ULM for executing the
program, you can turn it into an executable script. The first line of an
executable script contains, after the __Shebang__ (``#!``), the path to a
program that gets invoked with the script file as argument. So in the case
above the first line should be

---- CODE (type=sh) ------------------------------------------------------------
#! ../ulm-generator/1_ulm_build/stack/ulm
--------------------------------------------------------------------------------

With the two Unix commands

---- SHELL (path=session23/git/abc, fold) --------------------------------------
(echo '#! ../ulm-generator/1_ulm_build/stack/ulm'; cat a.out) > print_42
chmod +x print_42
--------------------------------------------------------------------------------

a new file ``print_42`` was created with the Shebang for the ULM in the first
line, followed by the content of _a.out_.

:import: session23/git/abc/print_42 [fold]

With _chmod_ this file was made executable:

---- SHELL (path=session23/git/abc, fold) --------------------------------------
./print_42
--------------------------------------------------------------------------------

Of course the creation of an executable can be automated with a makefile.

:links: Shebang -> https://en.wikipedia.org/wiki/Shebang_(Unix)

Example Directory and Makefile
==============================

Create a subdirectory and save the following files there (make sure the
makefile uses proper tabs):

:import: session23/git/abc/examples/Makefile [fold]

:import: session23/git/abc/examples/getuint64.s [fold]

:import: session23/git/abc/examples/printuint64.s [fold]

:import: session23/git/abc/examples/test.abc [fold]

Create an additional file "path-to-ulm" that contains the path to your ULM
assembler and the virtual machine. For example

---- CODE (file=session23/git/abc/examples/path-to-ulm) ------------------------
$HOME/tmp/ulm-generator/1_ulm_build/stack/
--------------------------------------------------------------------------------

or

---- CODE (file=session23/git/abc/examples/path-to-ulm) ------------------------
../../ulm-generator/1_ulm_build/stack/
--------------------------------------------------------------------------------

Then with ``make`` all files with the extension '``.abc``' will be translated
into executables:

---- SHELL (path=session23/git/abc/examples) -----------------------------------
make
echo 4 | ./test
--------------------------------------------------------------------------------