A small standard library for the ULM
Besides its core language, a programming language typically also defines a standard library; for C this is the C standard library. For the ULM C dialect we will rebuild some parts of the C standard library.
Currently the ULM C compiler only translates source code into assembly code for the ULM instruction set. Imagine that it could also produce assembly code for other architectures, e.g. the Intel architecture or the ARM architecture. Code written for the ULM C compiler would then be platform independent in the sense that it only needs to be recompiled on each platform. So we want to write as much as possible in C and as little as possible in assembly. Porting the library to another platform then only requires adapting the assembly part.
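The split between portable C code and a small platform-specific part can be tried out on any hosted system. The following is only a minimal sketch and not the ULM library itself: platform_putchar is a hypothetical stand-in for the one routine that would be written in assembly on each platform (here it merely appends to a buffer named terminal, which is also an assumption made for illustration), while portable_puts is plain C that would not have to change.

```c
#include <stddef.h>

/* Hypothetical buffer standing in for the terminal. */
static char terminal[64];
static size_t terminal_len = 0;

/* Platform-specific part: on a real platform this would be a few lines of
   assembly. Here it only records the character (illustration only). */
static void platform_putchar(char ch)
{
    if (terminal_len + 1 < sizeof terminal) {
        terminal[terminal_len++] = ch;
        terminal[terminal_len] = '\0';
    }
}

/* Portable part: it relies only on platform_putchar, so it just needs to be
   recompiled when the platform changes. */
static void portable_puts(const char *str)
{
    while (*str) {
        platform_putchar(*str);
        ++str;
    }
}
```

After calling portable_puts("hello") the buffer terminal holds the five characters of the string. Porting this sketch would mean replacing platform_putchar only.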
Initial code base for the library used in this session
On theon the directory /home/numerik/pub/libulm_initial/ contains all source files for the library, some test programs and a makefile:
theon$ pwd
/home/numerik/pub/libulm_initial
theon$ ls
crt0.s  Makefile  putchar.s  puts.c  putui.s  xanswer.s  xhello_in_c_gcc  xhello_in_c.c  xhello.s
theon$
The makefile is derived from the one used in Session 11. It allows source files to be written in C or assembly:
Lib := libulm.a
CC := ulmcc
AS := ulmas
LD := ulmld
LDFLAGS := $(Lib)
RANLIB := ulmranlib

# files of form x*.s or x*.c are test programs and an executable x* gets
# created.
TestTargets := $(patsubst %.s,%,$(wildcard x*.s)) \
               $(patsubst %.c,%,$(wildcard x*.c))

# machinery to cleanup the directory if test programs were renamed or deleted
XObjRemoves := $(filter-out $(patsubst %,%.o,$(TestTargets)),$(wildcard x*.o))
XTstRemoves := $(patsubst %.o,%,$(XObjRemoves))
XSrcRemoves := $(if $(XObjRemoves),xsrcRemoves)

# all other files of form *.s or *.c are part of the library
LibSources := $(filter-out x%.s,$(wildcard *.s)) \
              $(filter-out x%.c,$(wildcard *.c))
LibObjects := $(patsubst %.c,%.o,$(patsubst %.s,%.o,$(LibSources)))

# machinery to cleanup the archive if source files for the library were renamed
# or deleted
LibContent := $(if $(wildcard $(Lib)),$(shell ar t $(Lib) | grep -v "^__"),)
LibRemoves := $(filter-out $(LibObjects),$(LibContent))
SrcRemoves := $(if $(LibRemoves),srcRemoves)
ArDelete := $(if $(LibRemoves),ar d $(Lib) $(LibRemoves),)

.PHONY: all clean srcRemoves xsrcRemoves

all: $(TestTargets) $(Lib) $(XSrcRemoves)

clean:
	$(RM) $(TestTargets) *.o $(Lib)

$(TestTargets): % : %.o $(Lib)
	$(LD) -o $@ $^

$(XSrcRemoves) :
	$(RM) $(XObjRemoves) $(XTstRemoves)

%.o : %.c
	$(CC) -o $*.s $^
	$(AS) -o $*.o $*.s
	$(RM) $*.s

# $(Lib)(%) : %
#	$(AR) cr $@ $^

$(SrcRemoves) :
	$(ArDelete)

$(Lib) : $(Lib)($(LibObjects)) $(SrcRemoves)
	$(RANLIB) $(Lib)
As usual, a simple make builds everything, i.e. the library and the test programs:
theon$ make
ulmas -o xanswer.o xanswer.s
ulmas -o crt0.o crt0.s
ar rv libulm.a crt0.o
ar: creating libulm.a
a - crt0.o
ulmas -o putchar.o putchar.s
ar rv libulm.a putchar.o
a - putchar.o
ulmas -o putui.o putui.s
ar rv libulm.a putui.o
a - putui.o
ulmcc -o puts.s puts.c
ulmas -o puts.o puts.s
rm -f puts.s
ar rv libulm.a puts.o
a - puts.o
ulmranlib libulm.a
ulmld -o xanswer xanswer.o libulm.a
ulmas -o xhello.o xhello.s
ulmld -o xhello xhello.o libulm.a
ulmcc -o xhello_in_c.s xhello_in_c.c
ulmas -o xhello_in_c.o xhello_in_c.s
rm -f xhello_in_c.s
ulmld -o xhello_in_c xhello_in_c.o libulm.a
rm puts.o crt0.o putui.o putchar.o
theon$
For instance, this runs the “hello, world” program implemented in C:
theon$ xhello_in_c
hello, world!
theon$
And with make clean all generated files are deleted:
theon$ make clean
rm -f xanswer xhello xhello_in_c *.o libulm.a
theon$
Storing test programs and library source files in a single directory is certainly not the right thing to do in a larger project, but it is sufficient in our case. As in the previous session, we use a simple naming convention so that the build system can tell them apart: files that begin with 'x' are test programs, all others are part of the library.
However, you should get at least some impression of the kind of demands a proper build system should satisfy. First of all, it should be possible to add, delete or rename source files. Also, intermediate files (like object files) should only be kept around if they can be used to speed up rebuilding the project.
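The naming convention can be restated as a small classification routine. This is only a sketch in hosted C that mirrors the wildcard patterns of the makefile; the function names has_source_extension, is_test_program and is_library_source are made up for this illustration and are not part of the build system.

```c
#include <string.h>

/* Returns 1 if name ends in ".c" or ".s" and has a non-empty base name. */
static int has_source_extension(const char *name)
{
    size_t len = strlen(name);

    if (len < 3) {
        return 0;
    }
    return strcmp(name + len - 2, ".c") == 0
        || strcmp(name + len - 2, ".s") == 0;
}

/* Source files that begin with 'x' are test programs ... */
static int is_test_program(const char *name)
{
    return has_source_extension(name) && name[0] == 'x';
}

/* ... and all other source files belong to the library. */
static int is_library_source(const char *name)
{
    return has_source_extension(name) && name[0] != 'x';
}
```

With the initial directory content, xhello.s and xhello_in_c.c count as test programs, puts.c and crt0.s as library sources, and Makefile as neither.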
Source code for the library
Files with extension “.c” or “.s” that do not begin with an “x” are source files for the library implemented in C or assembly respectively:
theon$ ls [^x]*.[cs]
crt0.s  putchar.s  puts.c  putui.s
theon$
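The dependencies for building the static library follow directly from the makefile: the archive libulm.a depends on its members crt0.o, putchar.o, putui.o and puts.o, and each of these object files depends on the source file with the same base name (puts.o on puts.c, the others on the corresponding assembly file).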
With make libulm.a the build system only generates or updates the library:
theon$ make libulm.a
ulmas -o crt0.o crt0.s
ar rv libulm.a crt0.o
ar: creating libulm.a
a - crt0.o
ulmas -o putchar.o putchar.s
ar rv libulm.a putchar.o
a - putchar.o
ulmas -o putui.o putui.s
ar rv libulm.a putui.o
a - putui.o
ulmcc -o puts.s puts.c
ulmas -o puts.o puts.s
rm -f puts.s
ar rv libulm.a puts.o
a - puts.o
ulmranlib libulm.a
rm puts.o crt0.o putui.o putchar.o
theon$
Note that all object files, and also the assembly files generated from C code, are deleted afterwards. That is because keeping them would not speed up a rebuild. For example, if you modify puts.c then puts.s and puts.o always have to be regenerated:
theon$ touch puts.c
theon$ make libulm.a
ulmcc -o puts.s puts.c
ulmas -o puts.o puts.s
rm -f puts.s
ar rv libulm.a puts.o
r - puts.o
ulmranlib libulm.a
rm puts.o
theon$
So there is no benefit from keeping these intermediate files.
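The decision behind this can be sketched in a few lines of C. This is a simplified model under stated assumptions and not how make is actually implemented: timestamps are plain integers, and member_needs_update is a hypothetical name. The point is that only the time stamp of the source file and the time stamp of the archive member enter the decision, while puts.s and puts.o on disk play no role.

```c
/* Simplified model: an archive member has to be regenerated if it does not
   exist yet or if its source file is newer than the member.
   Timestamps are plain integers here (assumption for illustration). */
static int member_needs_update(int member_exists,
                               long member_mtime,
                               long source_mtime)
{
    if (!member_exists) {
        return 1;
    }
    return source_mtime > member_mtime;
}
```

After touch puts.c the source is newer than the member puts.o inside libulm.a, so the member is rebuilt from scratch, which is what the output above shows.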
For a quick look, here are the source files of the library:
Source file crt0.s:
.equ FP, 1
.equ SP, 2
.equ RET, 3
//------------------------------------------------------------------------------
// Function _start()
//------------------------------------------------------------------------------
.equ ret, 0
.equ fp, 8
.equ rval, 16
.text
.globl _start
_start:
// begin of the function body
ldzwq 0, %SP
// call function main()
subq 24, %SP, %SP
ldzwq main, %4
jmp %4, %RET
movzwq rval(%SP), %4
addq 24, %SP, %SP
halt %4
Source file putchar.s:
.equ FP, 1
.equ SP, 2
.equ RET, 3
//------------------------------------------------------------------------------
// Procedure putchar(ch)
//------------------------------------------------------------------------------
.equ ret, 0
.equ fp, 8
// procedure arguments
.equ ch, 16
.text
.globl putchar
putchar:
// function prologue
movq %RET, ret(%SP)
movq %FP, fp(%SP)
addq 0, %SP, %FP
// reserve space for 0 local variables.
subq 0, %SP, %SP
// begin of the function body
movzbq ch(%FP), %4
putc %4
// end of the function body
// function epilogue
putchar.leave:
addq 0, %FP, %SP
movq fp(%SP), %FP
movq ret(%SP), %RET
jmp %RET, %0
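The .equ constants ret, fp and ch in putchar.s are byte offsets into the stack frame, relative to %SP on entry (and to %FP after the prologue). As an illustration only, the same layout can be written down as a C struct on a hosted system; the type name putchar_frame is hypothetical and the ULM code does not use such a struct.

```c
#include <stddef.h>
#include <stdint.h>

/* Model of the stack frame as seen by putchar. */
struct putchar_frame {
    uint64_t ret;   /* .equ ret, 0   saved return address     */
    uint64_t fp;    /* .equ fp,  8   saved frame pointer      */
    uint8_t  ch;    /* .equ ch,  16  argument, read as a byte */
};
```

With this model, offsetof(struct putchar_frame, ch) is 16, which matches the offset used by movzbq ch(%FP), %4 in the function body.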
Source file puts.c:
extern void
putchar(char ch);

void
puts(char *str)
{
    while (*str) {
        putchar(*str);
        ++str;
    }
}
Source file putui.s:
.equ FP, 1
.equ SP, 2
.equ RET, 3
//------------------------------------------------------------------------------
// Procedure putui(n)
//------------------------------------------------------------------------------
.equ ret, 0
.equ fp, 8
// procedure arguments
.equ n, 16
// local variables
.equ p, -8
.equ buf, p-22
.text
.globl putui
putui:
// function prologue
movq %RET, ret(%SP)
movq %FP, fp(%SP)
addq 0, %SP, %FP
// reserve space for pointer p (8 bytes) and array buf with 22 characters
subq 30, %SP, %SP
// begin of the function body
/*
p = buf;
*/
ldswq buf, %4
addq %4, %FP, %4
movq %4, p(%FP)
/*
do {
*/
putui.do:
/*
*p = n % 10 + '0';
*/
movq n(%FP), %4
ldzwq 0, %5
divq 10, %4, %4
addq '0', %6, %6
movq p(%FP), %7
movb %6, (%7)
/*
++p;
*/
movq p(%FP), %4
addq 1, %4, %4
movq %4, p(%FP)
/*
n /= 10;
*/
movq n(%FP), %4
divq 10, %4, %4
movq %4, n(%FP)
/*
} while (n!=0);
*/
movq n(%FP), %4
subq 0, %4, %0
jnz putui.do
/*
while (p != buf) {
*/
putui.while:
ldswq buf, %4
addq %4, %FP, %4
movq p(%FP), %5
subq %4, %5, %0
jz putui.while_done
/*
--p;
*/
movq p(%FP), %4
subq 1, %4, %4
movq %4, p(%FP)
/*
putchar(*p);
*/
movq p(%FP), %4
movzbq (%4), %4
putc %4
jmp putui.while
putui.while_done:
// end of the function body
// function epilogue
putui.leave:
addq 0, %FP, %SP
movq fp(%SP), %FP
movq ret(%SP), %RET
jmp %RET, %0
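The C fragments in the comments of putui.s can be assembled into a complete function. The following sketch does this in hosted C so that it can be tried out on any system. The loop bodies are taken from the comments; everything else is an assumption made for illustration: the name putui_sketch, the parameter type uint64_t, and the helper emit_char, which collects the output in the buffer digits_out instead of printing it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical output buffer, so that the result can be inspected. */
static char digits_out[64];
static size_t digits_len = 0;

/* Stand-in for the putc instruction used in the assembly code. */
static void emit_char(char ch)
{
    if (digits_len + 1 < sizeof digits_out) {
        digits_out[digits_len++] = ch;
        digits_out[digits_len] = '\0';
    }
}

static void putui_sketch(uint64_t n)
{
    char buf[22];       /* a 64-bit value has at most 20 decimal digits */
    char *p = buf;

    /* store the digits in reverse order, least significant digit first */
    do {
        *p = (char)(n % 10 + '0');
        ++p;
        n /= 10;
    } while (n != 0);

    /* emit the digits from last stored to first stored */
    while (p != buf) {
        --p;
        emit_char(*p);
    }
}
```

The do-while loop guarantees that at least one digit is produced, so the value 0 is printed as "0" and not as an empty string.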
Test programs
Files whose names begin with 'x' are test programs, which can be written in C or assembly. Hence, these are the initial test programs:
theon$ ls x*.[cs]
xanswer.s  xhello_in_c.c  xhello.s
theon$
Each test program gets translated into an object file and linked against the library. To build specific test programs, pass them as arguments to make. For instance, this creates only xhello_in_c, which is then run:
theon$ make xhello_in_c
ulmcc -o xhello_in_c.s xhello_in_c.c
ulmas -o xhello_in_c.o xhello_in_c.s
rm -f xhello_in_c.s
ulmld -o xhello_in_c xhello_in_c.o libulm.a
theon$ ulm xhello_in_c
hello, world!
theon$
Quiz 15: Make puts standard conforming
In this exercise you are supposed to change the implementation of puts so that it conforms to the C standard library (see the man page of the function puts). On success puts returns a non-negative value, otherwise it returns EOF, which is an implementation-defined macro (in most implementations it expands to -1). To get an idea of why puts can fail in general, read “When will puts() fail?” on Stack Overflow.
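How a caller deals with this contract can be shown on a hosted system. The following sketch uses the puts of the host's C library from <stdio.h>, not the ULM library; the function name print_greeting is made up for the example. Because success is reported by any non-negative value, the result is compared against EOF and not against zero.

```c
#include <stdio.h>

/* Returns 0 if the greeting was written and 1 if puts reported an error. */
static int print_greeting(void)
{
    if (puts("hello, world!") == EOF) {
        return 1;
    }
    return 0;
}
```

A test program could use the result of print_greeting as its exit status.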
On the ULM, calling puts will always succeed, so changing the implementation has the sole purpose of being compliant. This is for instance achieved by the following implementation, which always returns zero (some real-world implementations instead return the number of characters written):
extern void
putchar(char ch);

int
puts(char *str)
{
    while (*str) {
        putchar(*str);
        ++str;
    }
    return 0;
}
So what is it that you have to do? You should get a taste of the consequences of changing the interface of a library: adapt the test programs xhello.s and xhello_in_c.c so that they work together with the new puts function. On theon submit the source files as follows:
submit hpc quiz15 xhello.s xhello_in_c.c