

Ulm's Oberon Library: OberonLex


NAME

OberonLex - lexical analysis of Oberon sources

SYNOPSIS

(* keywords *)
CONST array = 0; begin = 1; case = 2; const = 3; definition = 4; div = 5;
CONST do = 6; else = 7; elsif = 8; end = 9; exit = 10; for = 11; if = 12;
CONST import = 13; in = 14; is = 15; loop = 16; mod = 17; module = 18;
CONST nil = 19; of = 20; or = 21; pointer = 22; procedure = 23;
CONST record = 24; repeat = 25; return = 26; then = 27; to = 28;
CONST type = 29; until = 30; var = 31; while = 32; with = 33;
(* operators and delimiters *)
CONST plus = 40; minus = 41; times = 42; slash = 43; tilde = 44;
CONST ampersand = 45; period = 46; comma = 47; semicolon = 48; bar = 49;
CONST lparen = 50 (* "(" *); lbracket = 51 (* "[" *); lbrace = 52 (* "{" *);
CONST becomes = 53; arrow = 54;
CONST eql = 55; neq = 56; lst = 57; grt = 58; leq = 59; geq = 60;
CONST range = 61; colon = 62;
CONST rparen = 63 (* ")" *); rbracket = 64 (* "]" *); rbrace = 65 (* "}" *);
(* miscellaneous symbols *)
CONST ident = 70;
CONST intconst = 71; hexconst = 72; realconst = 73; longrealconst = 74;
CONST charconst = 75; string = 76; comment = 77; eop = 78;
CONST symbols = 79; (* # of symbols *)
TYPE Symbol = SHORTINT;


TYPE Location = Streams.Count;
TYPE Token =
   RECORD
      (Objects.ObjectRec)
      begin, end: Location;
      sy: Symbol;
      ident: ConstStrings.String; (* if sy = ident, NIL otherwise *)
      text: Streams.Stream; (* constants and comments, NIL otherwise *)
   END;

CONST ioError = 0; (* unable to read from source stream *)
CONST backFailed = 1; (* unable to push back look-ahead *)
CONST errors = 2;
TYPE ErrorEvent = POINTER TO ErrorEventRec;
TYPE ErrorEventRec =
   RECORD
      (Events.EventRec)
      s: Streams.Stream;
      begin, end: Location;
   END;
VAR error: Events.EventType;
VAR errormsg: ARRAY errors OF Events.Message;

PROCEDURE GetToken(s: Streams.Stream; VAR token: Token);
PROCEDURE GetSymString(sy: Symbol; VAR string: ARRAY OF CHAR);
PROCEDURE SetStringDomain(s: Streams.Stream; domain: ConstStrings.Domain);

DESCRIPTION

OberonLex supports the lexical analysis of Oberon and Oberon-2 programs.

GetToken starts reading from the current position of s and returns the next input token in token. Note that the stream must be buffered and support tell operations. Buffering is needed for Streams.Back (a look-ahead of one character is needed) and the tell operation is used to record the location of the token. Stream locations may be converted into line/column pairs by Lines.
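The usual way to drive GetToken is a loop that runs until eop is returned. The following is a minimal sketch that uses only the declarations from the SYNOPSIS; the module name CountTokens and the procedure Count are hypothetical, and the caller is assumed to pass a stream that is buffered and supports tell operations.

```oberon
MODULE CountTokens; (* hypothetical example module *)

   IMPORT OberonLex, Streams;

   PROCEDURE Count(s: Streams.Stream) : LONGINT;
      (* count the tokens read from s until eop is returned;
         s is assumed to be buffered and to support tell operations
      *)
      VAR
         token: OberonLex.Token;
         count: LONGINT;
   BEGIN
      count := 0;
      OberonLex.GetToken(s, token);
      WHILE token.sy # OberonLex.eop DO
         INC(count);
         OberonLex.GetToken(s, token);
      END;
      RETURN count
   END Count;

END CountTokens.
```

Because eop is returned both at end of file and on input errors, a loop of this shape terminates in either case.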

White space (as defined by StreamDisciplines) and comments are skipped. Any character that is neither white space nor one of the characters from which legal Oberon tokens are composed is considered illegal. At end of file (Streams.eof), or in case of input errors, the special token eop is returned. To avoid an endless series of error messages when a binary file is submitted accidentally, a fatal input error is generated if the fraction of illegal characters becomes too large.

All keywords and symbols of Oberon and Oberon-2 are recognized. When parsing Oberon rather than Oberon-2, the symbol for should be converted into ident, because FOR is a keyword in Oberon-2 only. Constants are accepted in conformance with the Oberon reports. They are not interpreted, however, but returned as a text stream, because their interpretation could be machine-dependent (at least in the case of floating-point values).
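The conversion of for into ident can be done in a thin wrapper around GetToken. This is only a sketch: the module name OberonOnly is hypothetical, and the wrapper rewrites just the sy component. Whether token.ident already carries the spelling of a keyword is an assumption that should be checked against the Token record before relying on it.

```oberon
MODULE OberonOnly; (* hypothetical example module *)

   IMPORT OberonLex, Streams;

   PROCEDURE GetToken(s: Streams.Stream; VAR token: OberonLex.Token);
      (* like OberonLex.GetToken, but reports FOR as an ordinary
         identifier; only token.sy is changed here, token.ident
         is left as delivered by OberonLex (assumption: it is usable)
      *)
   BEGIN
      OberonLex.GetToken(s, token);
      IF token.sy = OberonLex.for THEN
         token.sy := OberonLex.ident;
      END;
   END GetToken;

END OberonOnly.
```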

GetSymString converts a symbol into a printable string. This may be useful for error messages.
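A parser might use GetSymString to report an unexpected symbol. The sketch below assumes that the library's Write module offers Write.String and Write.Ln for standard output; the module name ExpectSym, the procedure Expected, and the buffer length of 32 are hypothetical choices made for illustration.

```oberon
MODULE ExpectSym; (* hypothetical example module *)

   IMPORT OberonLex, Write; (* Write.String and Write.Ln are assumed *)

   PROCEDURE Expected(expected, found: OberonLex.Symbol);
      (* print a message of the form "expected X but found Y" *)
      VAR
         text: ARRAY 32 OF CHAR; (* assumed long enough for any symbol *)
   BEGIN
      Write.String("expected ");
      OberonLex.GetSymString(expected, text); Write.String(text);
      Write.String(" but found ");
      OberonLex.GetSymString(found, text); Write.String(text);
      Write.Ln;
   END Expected;

END ExpectSym.
```

A call such as Expected(OberonLex.semicolon, token.sy) would then name both the missing and the offending symbol.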

SetStringDomain should preferably be called before the first call of GetToken. It sets the domain to which all strings returned in tokens belong. Note that even keywords are returned in the string component and belong to this domain as well. If SetStringDomain has not been called for s, GetToken creates a ConstStrings domain on its own.
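One possible use of SetStringDomain is to let several source streams share a single domain, so that the strings returned for all of them belong together. The sketch below takes the domain as a parameter, so it does not depend on how a ConstStrings.Domain is created; the module name SharedDomain and the procedure Prepare are hypothetical.

```oberon
MODULE SharedDomain; (* hypothetical example module *)

   IMPORT ConstStrings, OberonLex, Streams;

   PROCEDURE Prepare(s1, s2: Streams.Stream; domain: ConstStrings.Domain);
      (* attach the same string domain to both source streams;
         preferably called before the first OberonLex.GetToken
         on either stream
      *)
   BEGIN
      OberonLex.SetStringDomain(s1, domain);
      OberonLex.SetStringDomain(s2, domain);
   END Prepare;

END SharedDomain.
```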

DIAGNOSTICS

OberonLex generates two kinds of errors:

All error events are related to the stream. OberonLex gives up once an upper limit of 100 lexical errors has been reached.

SEE ALSO

CompilerErrors
   general compiler error events
ConstStrings
   read-only strings of arbitrary length
RelatedEvents
   error handling
StreamDisciplines
   definition of white space
Streams
   abstraction for byte sequences

Edited by: borchert, last change: 2006/08/21, revision: 1.2, converted to HTML: 2006/08/21
