Storage Classes extern and static
Exercise: Lexer for the abc Compiler
We have to apply some improvements to our compiler project that we started in Session 13, Page 4:
- The lexer should also give us information about the line number and column at which a token was found. It should also be possible to retrieve the current token directly. And yes, we use global variables declared in a header here (in the end only lexer.h will declare a global variable, and just one).
The header was therefore modified as follows (we clearly need struct and enum soon):
#ifndef ABC_LEXER_H
#define ABC_LEXER_H

#include <stddef.h>

/*
   Returns token kind:

    0 = EOI (end of input)
    1 = BAD_TOKEN
    2 = HEX_LITERAL
    3 = OCT_LITERAL
    4 = DEC_LITERAL
    5 = PLUS ('+')
    6 = MINUS ('-')
    7 = ASTERISK ('*')
    8 = SLASH ('/')
    9 = PERCENT ('%')
   10 = EQUAL ('=')
   11 = LPAREN (left parenthesis '(')
   12 = RPAREN (right parenthesis ')')
   13 = SEMICOLON
   14 = IDENTIFIER
*/
int getToken(void);

// direct access to current token
extern int token_kind;
extern size_t token_line;
extern size_t token_col;

#endif // ABC_LEXER_H
- This is the modified test program. Only one line of code was added, so that it also prints the token position:
#include <stdio.h>

#include "lexer.h"

int main(void)
{
    int token;
    while ((token = getToken()) != 0) {
        printf("%zu.%zu: ", token_line, token_col);
        if (token == 1) {
            printf("BAD_TOKEN\n");
        } else if (token == 2) {
            printf("HEX_LITERAL\n");
        } else if (token == 3) {
            printf("OCT_LITERAL\n");
        } else if (token == 4) {
            printf("DEC_LITERAL\n");
        } else if (token == 5) {
            printf("PLUS\n");
        }
    }
}
- In lexer.c, all functions and global variables that are not declared in the header should be declared static.
Of course, the real task here is to change the implementation in lexer.c so that, after each call to getToken(), the global variables declared in lexer.h hold the correct values.
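One way the position tracking could be organized in lexer.c is sketched below. It is not the reference solution: the helper names advancePosition and nextCh and the variables ch, cur_line and cur_col are assumptions made for this sketch, a tab is counted as one column, and getToken is reduced to single-character tokens. Literals and identifiers from the existing lexer are left out and would have to be merged in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Definitions of the globals that lexer.h declares with extern. */
int token_kind;
size_t token_line;
size_t token_col;

/* Everything below is internal to lexer.c and therefore static.
   All of these names are assumptions made for this sketch. */
static int ch;               /* current character, 0 before the first read */
static size_t cur_line = 1;  /* line of ch */
static size_t cur_col = 0;   /* column of ch, 0 before the first read */

/* Move the position past the character `prev` that was just consumed. */
static void
advancePosition(int prev)
{
    if (prev == '\n') {
        ++cur_line;
        cur_col = 1;
    } else {
        ++cur_col;
    }
}

/* Read the next character and keep cur_line/cur_col in sync with it. */
static int
nextCh(void)
{
    advancePosition(ch);
    ch = getchar();
    return ch;
}

/* Reduced getToken: handles single-character tokens only. */
int
getToken(void)
{
    static const char punct[] = "+-*/%=();";  /* kinds 5..13, in order */
    size_t i;

    if (cur_col == 0) {
        nextCh();                /* first call: load the first character */
    }
    while (ch == ' ' || ch == '\t' || ch == '\n') {
        nextCh();                /* skip white space */
    }
    assert(cur_col > 0);         /* a character (or EOF) has been read */

    /* Record where the token starts before consuming any of it. */
    token_line = cur_line;
    token_col = cur_col;

    if (ch == EOF) {
        return token_kind = 0;   /* EOI */
    }
    token_kind = 1;              /* BAD_TOKEN unless matched below */
    for (i = 0; punct[i] != '\0'; ++i) {
        if (ch == punct[i]) {
            token_kind = 5 + (int)i;
        }
    }
    nextCh();                    /* consume the token's only character */
    return token_kind;
}
```

The point of the design is that the position is copied into token_line and token_col after white space has been skipped and before the first character of the token is consumed. Because only getToken and the three variables from lexer.h have external linkage, the helpers cannot clash with names in other translation units.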
An Example
You can use a file like this to test your lexer:
a = 5;
b = 42;
c = (a + b) *2;
123 0123 0xaB12 abc +-/*%^()
Then with xtest_lexer you should get:
theon$ ./xtest_lexer < test_lexer.in
1.1: IDENTIFIER
1.3: EQUAL
1.5: DEC_LITERAL
1.6: SEMICOLON
2.1: IDENTIFIER
2.3: EQUAL
2.5: DEC_LITERAL
2.7: SEMICOLON
3.1: IDENTIFIER
3.3: EQUAL
3.5: LPAREN
3.6: IDENTIFIER
3.8: PLUS
3.10: IDENTIFIER
3.11: RPAREN
3.13: ASTERISK
3.14: DEC_LITERAL
3.15: SEMICOLON
4.1: DEC_LITERAL
4.5: OCT_LITERAL
4.10: HEX_LITERAL
4.17: IDENTIFIER
4.21: PLUS
4.22: MINUS
4.23: SLASH
4.24: ASTERISK
4.25: PERCENT
4.26: BAD_TOKEN
4.27: LPAREN
4.28: RPAREN
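The expected output names every token kind from lexer.h, while the test program shown earlier only prints names for kinds 1 to 5. One possible way to cover all kinds without a long if/else chain is a lookup table indexed by the token kind. The following is only a sketch: the table token_names and the helper tokenKindStr are hypothetical and not part of lexer.h.

```c
#include <assert.h>
#include <stddef.h>

/* Names indexed by token kind, in the order documented in lexer.h. */
static const char *const token_names[] = {
    "EOI",         "BAD_TOKEN",   "HEX_LITERAL", "OCT_LITERAL",
    "DEC_LITERAL", "PLUS",        "MINUS",       "ASTERISK",
    "SLASH",       "PERCENT",     "EQUAL",       "LPAREN",
    "RPAREN",      "SEMICOLON",   "IDENTIFIER",
};

/* Hypothetical helper: map a token kind to its name. */
const char *
tokenKindStr(int kind)
{
    size_t n = sizeof token_names / sizeof token_names[0];

    assert(n == 15);  /* the table must cover kinds 0..14 from lexer.h */
    if (kind < 0 || (size_t)kind >= n) {
        return "UNKNOWN";
    }
    return token_names[kind];
}
```

With such a helper, the body of the loop in the test program could shrink to a single call like printf("%zu.%zu: %s\n", token_line, token_col, tokenKindStr(token)).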
Once we have double-checked that the lexer handles lines and columns correctly, we can save the output in a file:
theon$ ./xtest_lexer < test_lexer.in > test_lexer.ref.out
Later we can add a target check to the makefile so that make check simply compares future results with this trusted output. In this case the target will basically run a diff:
theon$ ./xtest_lexer < test_lexer.in > test_lexer.out
theon$ diff test_lexer.out test_lexer.ref.out