Calling conventions
What is shown here is a simple and efficient way for realizing a function call convention. However, there is a serious catch to it. It only works for so called leaf functions, i.e. functions that do not call another functions. So in particular something like recursive function calls will not work with that.
At the moment we develop concepts for programming a computer in general and do not really focus on efficiency. In Session 8 you saw how difficult it can get to write fairly complex programs without having a possibility to break components into reusable functions (for reading an integer, computing something, printing an integer). What we did there was thinking about a program that consisted of such parts (i.e. functions) but instead of calling a function we just inserted its code where it was needed. In the real world this is an optimization called function inlining. This is an optimization because function calls always have some overhead. You will see here that this overhead consists of additional code for passing arguments, calling the function, returning form the function etc.
With respect to efficiency the function call convention would be the next best thing compared to inlining the call. But again, efficiency is not our main concern here, it is still about programming the ULM in general. So let's say we look at this convention because compared to the general case (which includes non-leaf functions) it is easier to deal with the special case and for this case we seek the simplest, possible solution.
Using the instruction jmp %CALL, %RET for function calls
Key for supporting functions calls on the ULM is the jmp instruction with Opcode 0x40. It has two register operands to which we refer as %CALL and %RET respectively. When executed it does two things:
-
It jumps to the address in %CALL and
-
stores the return address %RET.
So what does return address mean? It is the address of the instruction that follows the jump instruction in memory. For example, when the jump instruction is at address 0x20 then the return address is 0x24.
The sequential execution and the implementation pattern for a program that calls a leaf-function func twice can be visualised as follows:
And analogously cases where different functions are called the flow chart can be represented like this:
You want a more technical explanation?
Recall that the ULM implements this simple von Neumann cycle for executing a program. So when an instruction gets executed it was loaded from address %IP (instruction pointer) into the %IR (instruction register). So what the instruction does when it gets executed is
\[\left( u(\%\text{ip}) + 4 \right) \bmod 2^{64} \to u(\%\text{RET})\quad\text{and}\quad u(\%\text{CALL}) \to u(\%\text{ip})\]Returning from a function is then possible with a jmp %RET, %0 instruction. Of course this requires that %RET by then still (or again) contains the original return address. So when you program you should know that you can not simply use register %RET without care. So you see that we need some rules, a protocol or call it convention for function calls.
Descriptions of such conventions usually the terms caller and callee. As long as we only deal with leaf functions these terms have a simple meaning. The program text can be divided into a main program and functions. In the main program we only have code that eventually calls a function, i.e. caller code. In the functions we only have code that gets executed when a function was called, i.e. callee code. Now think of two persons (caller and callee) writing a program together. The caller is writing the main program, the callee all the functions. The function call conventions are a contract between the caller and callee and allows to collaborate almost independently from each other. Actually the contract specifies what information one has to provide to another.
Caller
The caller has to know the address of the function, how to pass arguments and eventually how to get a result back from the function. The caller also has to know what registers the function might modify.
In the simplest form the function does not expect any arguments and does not return a result. In this case the caller just loads the address of the function into %CALL and does the (function) jump:
Also shown in this picture is the entry point of the program. The entry point is the address of the instruction that gets executed first when execution of the program starts. Of course a correct program should also have at least one exit point, i.e. some halt instruction. As a programmer you have to make sure that at least on of these exit points gets reached.
Callee
Accordingly the callee needs to know how to receive arguments, how to return and eventually how to give back a result. And of course, what registers can be used and modified for computations.
Again we first consider that the function does not receive any arguments and does not return any result. Then we basically have this (we talk about what registers the function implementation is allowed to modify below):
Example
So here another “hello, world” program:
theon$ ulmas -o func01 func01.s theon$ ulm func01 hello, world! hello, world! theon$
This time it uses functions. For the sake of simplicity we first stick to the case where the function does not receive arguments, does not return a result and not even uses any registers. But it has two functions and does four function calls:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 | .equ CALL, 1
.equ RET, CALL + 1
.text
/*
Entry point of the program
*/
# hello_function();
ldzwq hello_function, %CALL
jmp %CALL, %RET
# world_function();
ldzwq world_function, %CALL
jmp %CALL, %RET
# hello_function();
ldzwq hello_function, %CALL
jmp %CALL, %RET
# world_function();
ldzwq world_function, %CALL
jmp %CALL, %RET
/*
Exit point
*/
halt 0
//------------------------------------------------------------------------------
/*
void
hello_function();
*/
hello_function:
putc 'h'
putc 'e'
putc 'l'
putc 'l'
putc 'o'
putc ','
putc ' '
jmp %RET, %0
/*
void
hello_function();
*/
world_function:
putc 'w'
putc 'o'
putc 'r'
putc 'l'
putc 'd'
putc '!'
putc '\n'
jmp %RET, %0
|
Callee, the perfect roomer
The caller and callee need a protocol that specifies what registers (and as we talk about later what memory region) can be used without conflicts. We keep it simple here: The callee is only allowed to modify registers that we denote as %CALLEE0, %CALLEE1, etc. This will be defined through some .equ directives. For example in the next example we define
1 2 | .equ CALLEE0, 1
.equ CALLEE1, CALLEE0 + 1
|
Think of this like a contract between you and a roomer that lives in your apartment. This contract specifies what the roomer is allowed to use in your apartment.
In other examples we give the callee access to more registers but it always will be limited. Of course the ULM would have enough registers to share the registers more equally. But first of all, we want to keep it close enough to reality where most computers have much fewer registers and they can live with that (And so can we: The ULM has only 256 registers because that makes it simple to read the machine code).
Next, a callee can actually use any register as long as the callee can restore them after usage. That means before returning to the caller all registers that do not belong to the callee must have their original value. So no harm done. Like in the example with the roomer in your apartment. You certainly don't mind if someone uses your stuff if it is guarantied that you will never notice ;-)
Furthermore, sooner or later we want to think about functions that are non-leaf functions, i.e functions that call other functions. In this case the callee becomes a caller (and then again a callee before it returns). So just equally sharing registers between caller and callee would be a dead end.
But let's talk about here and now. We still have to specify how arguments are passed and results returned.
Passing arguments
A function receives its arguments (if any) in the callee registers: The first argument in %CALLEE0, the second in %CALLEE1, etc.
The caller has to know how many arguments are expected by the callee. There is no way that the assembler can check if you are passing to few or too many arguments. So the programmer of the function has to document this (or the caller would have to look at the code and figure that out).
Receiving results
If a function returns a value this value will be store in %CALLEE0 after the return. So the callee has to put it there before returning.
Example for a puts function
We continue with a multilingual version of the “hello, world” program
theon$ ulmas -o func02 func02.s theon$ ulm func02 hello, world! Hallo Welt!🍺 你好,世界 theon$
In this example the function puts expects one argument (in %CALLEE0) which is a pointer to a string that will be printed by the function:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 | # Example for realizing leaf functions (recursive calls will fail!)
# registers that a callee can use
.equ CALLEE0, 1
.equ CALLEE1, CALLEE0 + 1
# all other registers belong to the caller
.equ CALL, CALLEE1 + 1
.equ RET, CALL + 1
.data
str0 .string "hello, world!\n"
str1 .string "Hallo Welt!🍺\n"
str2 .string "你好,世界\n"
.text
/*
Entry point of the program
*/
# puts(str0);
ldzwq puts, %CALL
ldzwq str0, %CALLEE0
jmp %CALL, %RET
# puts(str1);
ldzwq puts, %CALL
ldzwq str1, %CALLEE0
jmp %CALL, %RET
# puts(str2);
ldzwq puts, %CALL
ldzwq str2, %CALLEE0
jmp %CALL, %RET
/*
Exit point
*/
halt 0
//------------------------------------------------------------------------------
/*
void
puts(const char *);
*/
puts:
.L0 movzbq (%CALLEE0), %CALLEE1
subq 0, %CALLEE1, %0
jz .L1
putc %CALLEE1
addq 1, %CALLEE0, %CALLEE0
jmp .L0
.L1
jmp %RET, %0
|
Example with functions puts, readuint and printuint
In general our functions need more than just two registers. Until we don't know how to use a stack to save and restore registers that do not belong to a callee we have to provide more registers to the callee.
The following program reads in an integer and then prints it. The callee can now use up to 4 registers. If you use it interactively it looks like that:
1 2 3 4 | theon$ ulm func03
Type in some unsigned integer (that fits in 64 bits): 1234567890123
You typed in: 1234567890123
bye!
|
When I create this web pages automatically (which has the advantage that you can be sure that what you see in the shell box below is no fake) I can not type in numbers. So I generate the input by piping the output of an echo like here:
theon$ ulmas -o func03 func03.s theon$ echo 1234567890123 | ulm func03 Type in some unsigned integer (that fits in 64 bits): You typed in: 1234567890123 bye! theon$
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 | # Example for realizing leaf functions (recursive calls will fail!)
# registers that a callee can use
.equ CALLEE0, 1
.equ CALLEE1, CALLEE0 + 1
.equ CALLEE2, CALLEE1 + 1
.equ CALLEE3, CALLEE2 + 1
# all other registers belong to the caller
.equ CALL, CALLEE3 + 1
.equ RET, CALL + 1
.equ N, RET + 1
.data
str0 .string "Type in some unsigned integer (that fits in 64 bits): "
str1 .string "You typed in: "
str2 .string "\nbye!\n"
.text
/*
Entry point of the program
*/
# puts(str0);
ldzwq puts, %CALL
ldzwq str0, %CALLEE0
jmp %CALL, %RET
# readuint();
ldzwq readuint, %CALL
jmp %CALL, %RET
addq %CALLEE0, %0, %N
# puts(str1);
ldzwq puts, %CALL
ldzwq str1, %CALLEE0
jmp %CALL, %RET
# printuint();
ldzwq printuint, %CALL
addq %N, %0, %CALLEE0
jmp %CALL, %RET
# puts(str2);
ldzwq puts, %CALL
ldzwq str2, %CALLEE0
jmp %CALL, %RET
/*
Exit point
*/
halt 0
/*
void
puts(const char *);
*/
puts:
.L0 movzbq (%CALLEE0), %CALLEE1
subq 0, %CALLEE1, %0
jz .L1
putc %CALLEE1
addq 1, %CALLEE0, %CALLEE0
jmp .L0
.L1 jmp %RET, %0
/*
uint64_t
readuint();
*/
readuint:
ldzwq 0, %CALLEE0
.L2 getc %CALLEE1
subq '\n', %CALLEE1, %0
jz .L3
subq '0', %CALLEE1, %CALLEE1
imulq 10, %CALLEE0, %CALLEE0
addq %CALLEE1, %CALLEE0, %CALLEE0
jmp .L2
.L3 jmp %RET, %0
/*
void
printuint(uint64_t);
*/
.bss
.L4 .space 21
.text
printuint:
ldzwq .L4, %CALLEE3
ldzwq 0, %CALLEE1
.L5 divq 10, %CALLEE0, %CALLEE0
addq '0', %CALLEE2, %CALLEE2
addq 1, %CALLEE3, %CALLEE3
movb %CALLEE2, (%CALLEE3)
subq 0, %CALLEE0, %0
jnz .L5
.L6 movzbq (%CALLEE3), %CALLEE2
subq 0, %CALLEE2, %0
jz .L7
putc %CALLEE2
subq 1, %CALLEE3, %CALLEE3
jmp .L6
.L7 jmp %RET, %0
|