Enum Constants
An enumeration declaration associates symbols with numerical values, and the compiler then replaces each symbol with its value. We have already met one example: the compiler replaces EOF with -1. The purpose of such constants is to make the code more readable.
In our example, the following enumeration declaration could be useful:
```
enum TokenKind {
    EOI, // = 0 (end of input)
    BAD,
    INTEGER,
    PLUS,
    ASTERISK,
    SEMICOLON,
    LPAREN,
    RPAREN,
};
```
The symbol EOI (“End of Input”) is then replaced with 0, the symbol BAD with 1, INTEGER with 2, and so on. The symbols are thus sequentially numbered by the compiler starting from zero.
```
@ <stdio.hdr>

enum TokenKind {
    EOI, // = 0 (end of input)
    BAD,
    INTEGER,
    PLUS,
    ASTERISK,
    SEMICOLON,
    LPAREN,
    RPAREN,
};

global token: int = 0;
global ch: int = 0;
global val: int = 0;

fn getToken()
{
    if (ch >= '0' && ch <= '9') {
        val = 0;
        while (ch >= '0' && ch <= '9') {
            val = val * 10;
            val = val + ch - '0';
            ch = getchar();
        }
        token = INTEGER;
    } else if (ch == '+') {
        ch = getchar();
        token = PLUS;
    } else if (ch == '*') {
        ch = getchar();
        token = ASTERISK;
    } else if (ch == ';') {
        ch = getchar();
        token = SEMICOLON;
    } else if (ch == '(') {
        ch = getchar();
        token = LPAREN;
    } else if (ch == ')') {
        ch = getchar();
        token = RPAREN;
    } else if (ch == EOF) {
        token = EOI;
    } else {
        ch = getchar();
        token = BAD;
    }
}

fn main()
{
    while (true) {
        getToken();
        if (token == EOI) {
            break;
        } else if (token == INTEGER) {
            printf("token = %d, val = %d\n", token, val);
        } else {
            printf("token = %d\n", token);
        }
    }
}
```
Tasks
- Adapt and test the code.
- If getToken finds none of a digit, '+', '*', ';', '(', ')', or the end of input, it sets the variable token to BAD. With the next modification, such an “invalid” character is ignored, and the program instead searches for the next “interesting” character.
Replace line 47, where token is overwritten with BAD, with:

```
getToken();
```
In particular, spaces and line breaks will now be ignored, as shown in the following test:
```
MCL:session1 lehn$ ./a.out
12 + 3 * (4 + 5);
token = 2, val = 12
token = 3
token = 2, val = 3
token = 4
token = 6
token = 2, val = 4
token = 3
token = 2, val = 5
token = 7
token = 5
```