Basically, myacc has the same capabilities as yacc, although some changes in the declaration part were necessary to reflect the specific demands of Modula-2. Compared with yacc, some features have been added to improve the error handling of the generated parser.
Myacc supports the following options:
Input files to myacc are separated by %%-marks into an (optional) declaration section, the grammar rules and some additional Modula-2 text (referred to as ANY in the following).
%case variant { ; variant } %end
The following declarations have exactly the same semantics as the corresponding declarations of yacc, but follow a slightly modified syntax. Myacc will not recognize any keywords other than those listed. Please note that typename always refers to the left hand side of a variant in the %case..%end-construction. Single character constants (char) may be specified in octal or as a string of length one.
%token [ <typename> ] (identifier|char) {identifier|char}
%left [ <typename> ] (identifier|char) {identifier|char}
%right [ <typename> ] (identifier|char) {identifier|char}
%non [ <typename> ] (identifier|char) {identifier|char}
%type <typename> (identifier|char) {identifier|char}
The following two declarations are intended to support keyword recognition.
%keyword [ <typename> ] identifier {identifier}
PROCEDURE procedure (keywordtext : ARRAY OF CHAR; tokenvalue : INTEGER);
%init must not be preceded by anything other than a %imports-definition.
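The way the keyword procedure is used is not spelled out here. The following is only a sketch, assuming that the parser calls the procedure once for every keyword with its text and token value. It stores the keywords in a small table that a lexical analyzer could consult. The names KeywordDemo, Enter, Lookup and NotFound are hypothetical, and the token values 257 and 258 are made up for the demonstration.

```modula2
MODULE KeywordDemo;

(* Sketch only: Enter has the signature required for the keyword
   procedure; everything else is a hypothetical helper. *)

FROM InOut IMPORT WriteInt, WriteLn;

CONST
   MaxKeys  = 64;
   MaxLen   = 31;
   NotFound = -1;

TYPE
   Name  = ARRAY [0..MaxLen] OF CHAR;
   Entry = RECORD
              text : Name;
              token: INTEGER;
           END;

VAR
   table: ARRAY [0..MaxKeys-1] OF Entry;
   count: CARDINAL;

(* Copy src into dst, truncating if necessary; dst is 0C-terminated
   whenever there is room for the terminator. *)
PROCEDURE Copy(src: ARRAY OF CHAR; VAR dst: Name);
   VAR i: CARDINAL;
BEGIN
   i := 0;
   WHILE (i <= HIGH(src)) AND (i <= MaxLen) AND (src[i] # 0C) DO
      dst[i] := src[i]; INC(i);
   END;
   IF i <= MaxLen THEN dst[i] := 0C END;
END Copy;

(* Compare two strings that end either at 0C or at HIGH. *)
PROCEDURE Equal(a, b: ARRAY OF CHAR): BOOLEAN;
   VAR i: CARDINAL; ca, cb: CHAR;
BEGIN
   i := 0;
   LOOP
      IF i <= HIGH(a) THEN ca := a[i] ELSE ca := 0C END;
      IF i <= HIGH(b) THEN cb := b[i] ELSE cb := 0C END;
      IF ca # cb THEN RETURN FALSE END;
      IF ca = 0C THEN RETURN TRUE END;
      INC(i);
   END;
END Equal;

(* Candidate for the keyword procedure: remember one keyword. *)
PROCEDURE Enter(keywordtext: ARRAY OF CHAR; tokenvalue: INTEGER);
BEGIN
   IF count < MaxKeys THEN
      Copy(keywordtext, table[count].text);
      table[count].token := tokenvalue;
      INC(count);
   END;
END Enter;

(* Return the token value of text, or NotFound if it is no keyword. *)
PROCEDURE Lookup(text: ARRAY OF CHAR): INTEGER;
   VAR i: CARDINAL;
BEGIN
   i := 0;
   WHILE i < count DO
      IF Equal(table[i].text, text) THEN RETURN table[i].token END;
      INC(i);
   END;
   RETURN NotFound;
END Lookup;

BEGIN
   count := 0;
   (* Assumption: in a real parser these calls would come from the
      generated code, not from the user. *)
   Enter("BEGIN", 257);
   Enter("END", 258);
   WriteInt(Lookup("END"), 0); WriteLn;    (* a known keyword *)
   WriteInt(Lookup("begin"), 0); WriteLn;  (* not in the table *)
END KeywordDemo.
```

A linear search keeps the sketch short. A scanner for a language with many keywords would probably use a hash table instead.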
$[< typename> ][-] number
Internally used names begin with the prefix yy.
All objects declared by the user are global to the module. Statements must be part of a procedure. Initialization of global variables can be realized as an action of the first (empty) rule.
Actions must contain nothing but legal statement sequences.
The output files contain line number information for debugging of compilation errors.
   TYPE
      YYSTYPE = RECORD
                   CASE : CARDINAL OF
                   | 1: var1: type1;
                   | 2: var2: type2;
                   END
                END;

   (* Token definitions *)
   CONST
      token1 = 257;
      token2 = 258;
      (* ... *)

   VAR
      yylval: YYSTYPE;

   PROCEDURE yyparse() : INTEGER;
   PROCEDURE yytoktext(tok : INTEGER; VAR text : ARRAY OF CHAR);

   CONST
      YYERRCODE = 256;

   TYPE
      YYTOKSET = SET OF [0..511];

   VAR
      yylex  : PROCEDURE () : INTEGER;
      yyerror: PROCEDURE ( (* errorno:    *) INTEGER,       (* no. of error       *)
                           (* line:       *) INTEGER,       (* -1 if unknown      *)
                           (* col:        *) INTEGER,       (* -1 if unknown      *)
                           (* errortoken: *) INTEGER,       (* illegal input sym. *)
                           (* errortext:  *) ARRAY OF CHAR, (* "" if unknown      *)
                           (* expected:   *) YYTOKSET);     (* empty without      *)
                                                            (* -e option          *)
      yytext : POINTER TO CHAR;
      yyline : POINTER TO CARDINAL;
      yycol  : POINTER TO CARDINAL;
YYSTYPE defines the type of the value stack. The case variants result from the %case-definition.
The input symbols to be recognized by lexical analysis are defined as constants. Their names are equal to those used in the token definitions of the input grammar. Occasionally Myacc adds the suffix sy to avoid conflicts with predefined Modula-2 names.
yyparse executes syntax analysis, repeatedly calling yylex to obtain the next symbol from the input stream. A suitable procedure has to be assigned to yylex outside the parser module. yyparse expects this procedure to return:
yylval is the value associated with the current input symbol and must be set by the lexical analyzer.
yyparse returns 0 on successful completion of syntax analysis, -1 in case of an unrecovered syntax error and a value < -1 if parsing tables could not be loaded.
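As a minimal sketch of how a main program might connect the lexical analyzer and evaluate the result of yyparse: the module names Parser (standing for the module generated by myacc) and Scanner, as well as the procedure NextToken, are hypothetical and only serve the illustration.

```modula2
MODULE ParseDemo;

FROM InOut IMPORT WriteString, WriteInt, WriteLn;
(* Hypothetical: "Parser" stands for the module generated by myacc. *)
FROM Parser IMPORT yyparse, yylex;
(* Hypothetical: a hand-written scanner that exports
   PROCEDURE NextToken(): INTEGER and sets yylval itself. *)
FROM Scanner IMPORT NextToken;

VAR
   result: INTEGER;

BEGIN
   (* yylex must be assigned outside the parser module. *)
   yylex := NextToken;

   result := yyparse();
   IF result = 0 THEN
      WriteString("input accepted");
   ELSIF result = -1 THEN
      WriteString("unrecovered syntax error");
   ELSIF result < -1 THEN
      WriteString("parsing tables could not be loaded");
   ELSE
      (* Assumption: any other value stems from an action that
         terminated the analysis via yyexit. *)
      WriteString("terminated with exit code ");
      WriteInt(result, 0);
   END;
   WriteLn;
END ParseDemo.
```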
yytoktext yields a printable text for any token tok.
Any syntax error will cause yyparse to call yyerror, whether or not the error is recovered. Myacc assigns a default error handling routine to yyerror, which receives the indicated arguments from the parser. By default, yyparse provides information about the illegal symbol ('errortoken') and the consecutive number ('errorno') of the error currently treated. If myacc was called with option -e, a set of legal input symbols ('expected') will be computed as well. Information about the position ('line', 'col') and text ('errortext') of erroneous input symbols is available only if yyparse can dereference yyline, yycol and yytext (i.e. if these pointers have been assigned the addresses of variables that hold this information).
[syntax error in line 1] near identifier 'a' (column 1). Expected: token1 token2 token3
Of course, the set of legal input symbols created by yyparse (option -e required) depends on the current parsing state. It will contain all tokens listed in the verbose file as legal input symbols that cause a shift or reduce action of the parser.
Legal input symbols tend to hide behind the default parsing actions marked $else in the verbose file (yacc outputs . instead). Myacc cannot include these symbols in the set of expected tokens, but adding some more error tokens to the grammar rules may uncover them.
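The error interface described above might be used as in the following sketch, which replaces the default handler and lets yyparse see the scanner's position variables. The module names Parser and Scanner, the procedure NextToken and the variables curLine, curCol and curText are hypothetical. Only the yy-names and the parameter list of yyerror are taken from the declarations given earlier.

```modula2
MODULE ErrorDemo;

FROM SYSTEM IMPORT ADR;
FROM InOut IMPORT WriteString, WriteInt, WriteLn;
(* Hypothetical: "Parser" stands for the module generated by myacc. *)
FROM Parser IMPORT YYTOKSET, yyparse, yylex, yyerror, yytoktext,
                   yyline, yycol, yytext;
(* Hypothetical scanner; it is assumed to keep curLine, curCol
   (CARDINAL) and curText (ARRAY OF CHAR) up to date. *)
FROM Scanner IMPORT NextToken, curLine, curCol, curText;

(* Same parameter list as the procedure type of yyerror. *)
PROCEDURE Report(errorno, line, col, errortoken: INTEGER;
                 errortext: ARRAY OF CHAR; expected: YYTOKSET);
   VAR
      tok : CARDINAL;
      name: ARRAY [0..63] OF CHAR;
BEGIN
   WriteString("error "); WriteInt(errorno, 0);
   IF line >= 0 THEN                  (* -1 if unknown *)
      WriteString(", line "); WriteInt(line, 0);
   END;
   IF col >= 0 THEN                   (* -1 if unknown *)
      WriteString(", column "); WriteInt(col, 0);
   END;
   WriteString(": unexpected ");
   yytoktext(errortoken, name); WriteString(name);
   IF errortext[0] # 0C THEN          (* "" if unknown *)
      WriteString(" '"); WriteString(errortext); WriteString("'");
   END;
   WriteLn;
   IF expected # YYTOKSET{} THEN      (* empty without option -e *)
      WriteString("expected:");
      FOR tok := 0 TO 511 DO
         IF tok IN expected THEN
            yytoktext(tok, name);
            WriteString(" "); WriteString(name);
         END;
      END;
      WriteLn;
   END;
END Report;

BEGIN
   yylex   := NextToken;
   yyerror := Report;
   (* Let yyparse dereference the scanner's position and text. *)
   yyline  := ADR(curLine);
   yycol   := ADR(curCol);
   yytext  := ADR(curText);
   IF yyparse() # 0 THEN
      WriteString("parse failed"); WriteLn;
   END;
END ErrorDemo.
```

If the three pointers are left unassigned, the handler would presumably receive -1 for line and col and an empty errortext, so the sketch checks for these values before printing.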
The following procedures are available within actions: yyclearin and yyclearok have the same meaning as the corresponding macros of yacc; yyreset restores the initial parser state; yyexit terminates syntax analysis with an exitcode that yyparse returns as the result of syntax analysis. yyshowerror is the default error handling routine of any parser created by myacc.
With the exception of the language dependent features, any description of yacc applies to myacc as well. Thus you may consult the following references for an introduction to the use of myacc:
A. T. Schreiner & H. G. Friedman, Jr.
Introduction to Compiler Construction with UNIX
Prentice-Hall 1985.
A German translation is available as well (Hanser 1985).
Stephen C. Johnson
Yacc: Yet Another Compiler-Compiler
Programmers Workbench (Edition VII)
Myacc.m2                    parser implementation module
Myacc.d                     parser definition module
Myacc.dy                    parser definition module (option -r)
Myacc.t                     parsing tables (option -l)
Myacc.out                   verbose file (option -v)
Myacc.act                   temporary
Myacc.dat                   temporary
Myacc.loc                   temporary
Myacc.tmp                   temporary
/usr/local/lib/myaccpar     parser skeleton
Error messages issued by myacc are intended to be self-explanatory but sometimes they are not.
Ambiguous declarations are not always detected.
Myacc does not check whether the types given in a <typename>-construction are legal.
Unterminated actions tend to produce cascades of error messages (the last line will indicate their beginning).
If things go wrong, myacc occasionally complains about nonexistent streams it cannot close. These messages should be ignored.
Option -b should be used only if the input grammar is accepted by myacc without a fatal error message.