
Ulm's Oberon Library:


CompilerObjects - language-independent base of persistent compiler objects


TYPE Location = POINTER TO LocationRec;
TYPE LocationRec =
   RECORD
      src: Sources.Source;
      begin, end: Streams.Count;
   END;

TYPE Object = POINTER TO ObjectRec;
TYPE ObjectRec =
   RECORD
      (PersistentDisciplines.ObjectRec)
      loc: Location;
   END;

TYPE Attachment = POINTER TO AttachmentRec;
TYPE AttachmentRec =
   RECORD
      (ObjectRec)
   END;

CONST public = 1; private = 2;
TYPE ObjectType = SHORTINT; (* public or private *)
CONST archIndependent = 1;
CONST archDependent = 3;
TYPE Stage = SHORTINT; (* archIndependent or archDependent *)
TYPE CacheMode = SET;
CONST cachePublic = {public * archIndependent, public * archDependent};
CONST cachePrivate = {private * archIndependent, private * archDependent};
CONST cacheAll = cachePublic + cachePrivate;
CONST cacheArchIndependent = {public * archIndependent, private * archIndependent};
CONST cacheArchDependent = {public * archDependent, private * archDependent};

TYPE Header = POINTER TO HeaderRec;
TYPE HeaderRec =
   RECORD
      (PersistentDisciplines.ObjectRec)
      modname: ConstStrings.String;
      src: Sources.Source;
      srcid: ConstStrings.String;
      key: CompilerKeys.Key;
      dependencies: CompilerKeys.Set;
      type: ObjectType;
      stage: Stage;
      arch: Architectures.Architecture;
   END;

TYPE ModuleTable = POINTER TO ModuleTableRec;
TYPE ModuleTableRec =
   RECORD
      (Disciplines.ObjectRec)
   END;

CONST cannotOpenObjectText = 0;
CONST cannotReadTextHeader = 1;
CONST cannotDecodeObject = 2;
CONST corruptedInput = 3;
CONST invalidStructure = 4;
CONST errors = 5;
TYPE ErrorCode = SHORTINT; (* cannotOpenObjectText ... *)
TYPE ErrorEvent = POINTER TO ErrorEventRec;
TYPE ErrorEventRec =
   RECORD
      (Events.EventRec)
      code: ErrorCode;
      modname: ConstStrings.String;
   END;
VAR error: Events.EventType;
VAR errormsg: ARRAY errors OF Events.Message;

PROCEDURE CreateLocation(VAR location: Location; src: Sources.Source; begin, end: Streams.Count);

PROCEDURE CreateHeader(VAR header: Header; modname: ConstStrings.String);

PROCEDURE CreateModuleTable(VAR mtab: ModuleTable);
PROCEDURE AddModule(mtab: ModuleTable; header: Header; module: Object);
PROCEDURE AddHeader(mtab: ModuleTable; header: Header);
PROCEDURE Lookup(mtab: ModuleTable; modname: ConstStrings.String;
                 type: ObjectType; arch: Architectures.Architecture;
                 VAR header: Header; VAR module: Object) : BOOLEAN;
PROCEDURE LookupHeader(mtab: ModuleTable; modname: ConstStrings.String;
                       type: ObjectType; arch: Architectures.Architecture;
                       VAR header: Header) : BOOLEAN;

PROCEDURE Init(object: Object);
PROCEDURE InitBuiltInObject(object: Object);
PROCEDURE Attach(object: Object; attachment: Attachment);
PROCEDURE InclAttachment(object: Object; attachment: Attachment);
PROCEDURE GetAttachment(object: Object; VAR attachment: Attachment);

PROCEDURE ConvertObjectToText(object: Object;
                              table: ModularizedStructures.ObjectTable;
                              header: Header;
                              VAR text: PersistentTexts.Text;
                              errors: RelatedEvents.Object) : BOOLEAN;
PROCEDURE ConvertTextToObject(text: PersistentTexts.Text;
                              table: ModularizedStructures.ObjectTable;
                              header: Header;
                              VAR object: Object;
                              errors: RelatedEvents.Object) : BOOLEAN;
PROCEDURE GuardedRead(s: Streams.Stream; guard: Services.Type;
                      VAR object: CompilerObjects.Object) : BOOLEAN;
PROCEDURE Write(s: Streams.Stream; object: CompilerObjects.Object) : BOOLEAN;


CompilerObjects provides a language-independent base of persistent compiler objects. Successful compilation runs return an object of type Object (a so-called root object) that references further persistent objects, generated either by this run or by other runs for imported modules, by means of ModularizedStructures. Each compilation result is accompanied by a header that allows meta data to be examined without accessing or restoring the associated objects.

Not only root objects but every object that may be referenced by compilation results of other modules (like symbol table entries) should be an extension of CompilerObjects.Object, as Write and GuardedRead provide the necessary support for ModularizedStructures. Compiler objects should never be written to or read from external streams directly but instead be converted to or from persistent texts (see PersistentTexts) by ConvertObjectToText and ConvertTextToObject, respectively. This is necessary to store and load compilation results without taking care of topological orders (as required by ModularizedStructures) and to avoid time-consuming restoration operations as long as they are not strictly required.

Compiler objects optionally have a location of type Location that references a stretch of bytes inside a source using stream positions of Streams. These locations may later be used for the generation of error events (see CompilerErrors). Compiler locations are expected to be precise: they should not just note a single position but cover the full textual representation of the syntactical construct that is represented by a compiler object. While locations of leaf objects of an abstract syntax tree may just cover a token, upper nodes closer to the root should have a location that is the smallest interval including the locations of all sub nodes.
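The rule for upper nodes can be sketched as follows. This is only an illustration: the module name LocationSketch and the procedure Cover are hypothetical, and the sketch assumes that the components of LocationRec can be read by clients and that both locations refer to the same source.

```oberon
MODULE LocationSketch; (* hypothetical module, not part of the library *)

   IMPORT CompilerObjects, Streams;

   PROCEDURE Cover(first, second: CompilerObjects.Location;
                   VAR loc: CompilerObjects.Location);
      (* create in loc the smallest location that includes
         both first and second
      *)
      VAR
         begin, end: Streams.Count;
   BEGIN
      (* assumption: both sub nodes come from the same source *)
      ASSERT(first.src = second.src);
      begin := first.begin;
      IF second.begin < begin THEN
         begin := second.begin;
      END;
      end := first.end;
      IF second.end > end THEN
         end := second.end;
      END;
      CompilerObjects.CreateLocation(loc, first.src, begin, end);
   END Cover;

END LocationSketch.
```

A node with more than two sub nodes could fold Cover over their locations one after another.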

Most languages come with built-in objects representing, for example, basic types like integers and floating point numbers. One way to represent them as persistent objects is to treat them as items of a separate module but this is not always practicable. CompilerObjects allows built-in objects to be tagged as such. This should be used with care: tagged objects are saved and restored not by ModularizedStructures but directly by LinearizedStructures, which causes them to be cloned. This leads to multiple incarnations of built-in objects representing the same built-in language construct if they are loaded from different modules.

In a multi-stage compilation environment where the first pass generates an architecture-independent abstract syntax tree, followed by a second pass that generates an architecture-dependent interface usable for compilation runs of other modules importing this module, it might be useful to have twin objects. One object represents the architecture-independent part and belongs to a separate module in the sense of ModularizedStructures. The second object provides additional architecture-dependent information for the first object. The second object can easily reference the first object but not vice versa. This is not just due to the avoidance of cyclic references (as required by ModularizedStructures) but also because there might be many different architectures and therefore different architecture-dependent extensions to one architecture-independent syntax tree.

This problem is solved by so-called attachments of type Attachment which is a specific extension of Object. Attach creates a persistent tie between object and attachment, and GetAttachment returns the attachment belonging to object. While any number of attachments for one object may have once been created (usually one for each architecture), there must never be more than one attachment in existence in memory for each object instance in memory. Hence, if several pieces of architecture-dependent information are needed for one object, a single attachment object should carry all of them, either directly or by use of PersistentDisciplines. Note, however, that PersistentDisciplines or other means must not be used to modify loaded compilation results from earlier compilation runs as they have to be treated as read-only.

Headers provide all language-independent meta information needed about a result generated by a compiler:

modname: Module name that must conform to that of the source reference src.modname if src is non-NIL (see CompilerSources).
src: Source reference, may be NIL.
srcid: Source identification (see CompilerSources), must be non-NIL.
key: Associated interface-branding key (see CompilerKeys).
dependencies: Set of dependencies to other interfaces (see CompilerKeys).
type: Type of compiler result: either public (interface information usable on compiler runs of importing modules), or private (final machine code or an intermediate architecture-independent state representing the abstract syntax tree).
stage: Either archIndependent or archDependent.
arch: Concrete architecture: NIL if stage equals archIndependent, and non-NIL otherwise (see Architectures).

The integer constants of ObjectType and Stage have been chosen in a way that gives any possible combination of object type and stage a unique number. Sets of these combinations are cache modes that are used by Compilers to decide which kinds of intermediate results should be kept in storage for further compilations.
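The numbering can be checked against the constants of the synopsis: the four products of object type and stage are 1, 3, 2, and 6, so each combination maps to its own set element. The following sketch shows how a cache mode might be tested. The module name CacheModeSketch and the procedure IsCached are hypothetical and not part of the library.

```oberon
MODULE CacheModeSketch; (* hypothetical module, not part of the library *)

   IMPORT CompilerObjects;

   (* products of object type and stage:
         public  * archIndependent = 1 * 1 = 1
         public  * archDependent   = 1 * 3 = 3
         private * archIndependent = 2 * 1 = 2
         private * archDependent   = 2 * 3 = 6
   *)

   PROCEDURE IsCached(mode: CompilerObjects.CacheMode;
                      type: CompilerObjects.ObjectType;
                      stage: CompilerObjects.Stage) : BOOLEAN;
      (* return TRUE if results of the given object type
         and stage are covered by the cache mode
      *)
   BEGIN
      RETURN (type * stage) IN mode
   END IsCached;

END CacheModeSketch.
```

With these definitions, IsCached(CompilerObjects.cachePublic, CompilerObjects.private, CompilerObjects.archDependent) tests whether 6 is a member of {1, 3} and therefore yields FALSE.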

A module table collects all compilation results that are loaded or generated during a compilation run. As such, it is part of the context of a compilation (see Compilers) and maintained by the object loader (see ObjectLoader). Module tables have at most two entries per module: the public interface and the private part (abstract syntax tree or machine code). Entries consist of a header and, optionally, the loaded compilation result. Entries may be upgraded, i.e. they may advance from architecture-independent to architecture-dependent, and compilation results may be added where previously just a header was present. Module tables must always be consistent in the sense of CompilerKeys, and violations lead to failed assertions.

CreateLocation creates a location record representing the byte stretch [begin, end) of src. Except for noting the end of a source, end should be larger than begin. Locations are usually stored into the location component of compiler objects and later used to generate error messages (see CompilerErrors) or to allow debuggers to display source texts.

CreateHeader creates and initializes a header object for modname. Note that the remaining components need to be initialized before returning it as a compilation result or adding it to a module table.
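A minimal sketch of completing a header for an architecture-independent public interface is given below. The module name HeaderSketch and the procedure CreatePublicHeader are hypothetical, and the sketch assumes that the components of HeaderRec may be assigned directly by clients.

```oberon
MODULE HeaderSketch; (* hypothetical module, not part of the library *)

   IMPORT CompilerKeys, CompilerObjects, ConstStrings, Sources;

   PROCEDURE CreatePublicHeader(VAR header: CompilerObjects.Header;
                                modname, srcid: ConstStrings.String;
                                src: Sources.Source;
                                key: CompilerKeys.Key;
                                dependencies: CompilerKeys.Set);
   BEGIN
      CompilerObjects.CreateHeader(header, modname);
      (* assumption: the remaining components are writable *)
      header.src := src;       (* may be NIL *)
      header.srcid := srcid;   (* must be non-NIL *)
      header.key := key;
      header.dependencies := dependencies;
      header.type := CompilerObjects.public;
      header.stage := CompilerObjects.archIndependent;
      header.arch := NIL;      (* NIL as stage is archIndependent *)
   END CreatePublicHeader;

END HeaderSketch.
```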

CreateModuleTable creates an empty module table. While AddHeader adds an entry consisting only of a header to mtab, AddModule adds both a header and an object representing the result of a compilation. Upgrades are permitted, i.e. AddModule may be called for modules already added in order to add a compilation result where only the header was formerly known, to advance a compilation result from architecture-independent to architecture-dependent, or to add the compilation result of an entire module where just the public interface was part of the module table before. Note, however, that compatibility in the sense of CompilerKeys is to be strictly preserved.

Lookup and LookupHeader look up compiler objects of the module table by their module name, the object type (public or private), and their architecture. Note that the architecture passed to the lookup procedures just needs to be compatible with that of the object looked for (see Architectures).
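A lookup of a loaded public interface might be wrapped as sketched below. The module name LookupSketch and the procedure GetPublicInterface are hypothetical. As an entry may consist of a header only, the sketch additionally checks module against NIL; whether Lookup itself already fails in this case is not stated here, so the extra test is a precaution.

```oberon
MODULE LookupSketch; (* hypothetical module, not part of the library *)

   IMPORT Architectures, CompilerObjects, ConstStrings;

   PROCEDURE GetPublicInterface(mtab: CompilerObjects.ModuleTable;
                                modname: ConstStrings.String;
                                arch: Architectures.Architecture;
                                VAR module: CompilerObjects.Object) : BOOLEAN;
      (* return the loaded public interface of modname, if any *)
      VAR
         header: CompilerObjects.Header;
   BEGIN
      IF CompilerObjects.Lookup(mtab, modname, CompilerObjects.public,
                                arch, header, module) &
            (module # NIL) THEN
         RETURN TRUE
      END;
      module := NIL;
      RETURN FALSE
   END GetPublicInterface;

END LookupSketch.
```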

Init and InitBuiltInObject initialize ordinary and built-in compiler objects, respectively. Note that Init may be followed by InitBuiltInObject as long as object has not yet been tied to a module by writing it using Write.

Attach creates a persistent association between attachment and object. InclAttachment should be called for all attachments using a root object belonging to the same module in the sense of ModularizedStructures. This may be omitted if these attachments are already connected to the root object by other persistent references that do not cross module boundaries. GetAttachment retrieves the attachment belonging to an object. NIL is returned if there is no attachment present.
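Retrieving an architecture-dependent attachment could look like the following sketch. The module name AttachmentSketch, the type ArchInfo with its components, and the procedure GetArchInfo are hypothetical. The registration of ArchInfo as a persistent type and its creation are omitted here.

```oberon
MODULE AttachmentSketch; (* hypothetical module, not part of the library *)

   IMPORT CompilerObjects;

   TYPE
      (* hypothetical attachment carrying all architecture-dependent
         information for one object
      *)
      ArchInfo = POINTER TO ArchInfoRec;
      ArchInfoRec =
         RECORD
            (CompilerObjects.AttachmentRec)
            size, align: LONGINT;
         END;

   PROCEDURE GetArchInfo(object: CompilerObjects.Object;
                         VAR info: ArchInfo) : BOOLEAN;
      VAR
         attachment: CompilerObjects.Attachment;
   BEGIN
      CompilerObjects.GetAttachment(object, attachment);
      (* attachment is NIL if nothing has been attached *)
      IF (attachment # NIL) & (attachment IS ArchInfo) THEN
         info := attachment(ArchInfo);
         RETURN TRUE
      END;
      info := NIL;
      RETURN FALSE
   END GetArchInfo;

END AttachmentSketch.
```

Keeping all architecture-dependent components in one record follows the rule that at most one attachment per object instance may exist in memory.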

ConvertObjectToText and ConvertTextToObject convert root objects of compilation results to persistent texts and vice versa.

GuardedRead and Write are to be used in marshalling procedures to read and write compiler objects.


The following error events may be raised by CompilerObjects in its marshalling and conversion procedures:
cannotOpenObjectText is returned by ConvertTextToObject if it is unable to open the persistent text object (see PersistentTexts).
cannotReadTextHeader is returned by ConvertTextToObject on failures to read the module name that is expected at the beginning of text.
cannotDecodeObject is returned by ConvertTextToObject on failures of GuardedRead.
corruptedInput is returned by GuardedRead and the marshalling read procedure for attachment objects in case of inconsistencies.
invalidStructure is raised by GuardedRead in case of lookup failures regarding references to foreign persistent modules (see ModularizedStructures).

In addition, various assertions check the validity of module tables, headers, and parameters.


classification of target architectures
keys that identify dependencies of compiler-generated objects
objects representing source texts
service provider of StreamPosKeys for extensions of CompilerObjects.Object
general language-independent compiler interface
modularization of persistent object structures
general language-independent loader
persistent text objects

Edited by: borchert, last change: 2004/06/24, revision: 1.3, converted to HTML: 2004/06/24
