Proof of concept: Using a stack for function calls

We now investigate how a stack can be used in principal for functions calls with a (potentially) arbitrary deep call tree. That means a function can call a function which calls a functions etc. The only limitation will be the size of the memory.

For this proof of concept we define a calling convention that not really allows to implement functions that can do anything useful. But this minimalistic convention will be extended, so what you learn here will not just be some theoretical nonsense.

Working example

The flow chart below illustrates some of the basic ideas that will be realized by the calling convention in this session. It also outlines the general structure that all programs will have from now on:

  • Each program consists of functions, and you can have as many as you want (they just have to fit into memory),

  • each program has at least one function main,

  • when a program starts some code block initializes the stack, calls function main and then halts the ULM.

Using the ULM C compiler later allows us to write the function is a more abstract way. But the compiler just generates the assembly code that we have to write here by ourself.

The code block for initializing the stack will be the only part that needs to be hand coded in assembly. And it has to be ensured that the entry point of the program is the first instruction of this code block.

Calling convention without the gory details

The roles of caller and callee are no longer mutually exclusive. When a function calls another function the role switches from callee to caller. And every function switches back to the role of a callee before it returns.

So when we talk here about a caller we mean the situation right before a function call. And when we describe the role of the callee we describe the situation right before a function return. Therefore we just have to specify two things here:

  • How to call a function: Push the return address onto the stack and jump to the function.

  • How to return: Pop the return address from the stack and jump to the return address.

And on the Intel architecture (which is a CISC architecture) this is directly supported by instructions:

  • call foo would push the return address on the stack and jump to address foo, and

  • ret pops a return address and jumps to it.

These instructions make it simple to write and use functions, although you still have to be sure that call and ret work perfectly together. Because ret just pops whatever value is on top of the stack, that means whatever value was pushed last.

On RISK architectures (like the ULM or ARM) such things have to be done “more manually”, that means we have to do what the CISC architectures is hiding with such instructions. So that is why we have to deal with some gory details.

The gory details that we will avoid

It would be insane (i.e inefficient) to imitate the call instruction from the Intel architecture on the ULM. Neither the instruction set not the assembler provides you much help for computing the return address before you jump to a function. The best thing for getting the return address before the jump could be achieved by placing a label after the jump instruction:

1
2
3
    // ....
    jmp     %CALL,      %0              # jump to function
ret                                     # label for return address

Then we could actually load the return address before the jump:

1
2
3
4
5
6
    ldzwq   func,       %CALL           # load call address
    ldzwq   ret,        %VAL            # load return address
    subq    8,          %SP,    %SP     # push return address
    movq    %VAL,       (%SP)
    jmp     %CALL,      %0              # jump to function
ret

But why the hell is it so important to have the return address before the jump? It is not. The ULM provides the jmp %CALL, %RET instruction, so you have the return address after the jump for free. No need to waste instructions for loading or computing the return address in advance.

The gory details that we happily will (learn to) live with

Let me first show you how we achieve what matters and then explain why it is basically the same as Intel's call and ret:

Reserved registers

Two registers are used for the calling convention here:

1
2
    .equ    SP,     2
    .equ    RET,    3

Register %SP for the stack and register %RET for the return address.

Mechanism for calling a function

The essential pattern for calling a function is this:

1
2
3
4
5
6
    subq    8,      %SP,    %SP             # provide space on stack for callee
    /*
        Load the address of the function in a register %CALL.
    */
    jmp     %CALL,  %RET
    addq    8,      %SP,    %SP             # restore old stack state

Obviously the stack pointer gets manipulated but nothing gets copied to memory. You will see that we basically do half of a push operation. Also note that you can use any register (except %SP, %RET and obviously %0) for storing the address of the function address.

How to implement a function

The essential pattern for writing a function is

1
2
3
4
5
6
7
8
9
function_name:
      movq      %RET,           (%SP)       # function prologue: save %RET
      // begin of the function body

      /* Your implementation of the function */

      // end of the function body
      movq      (%SP),  %RET                # function epilogue: restore %RET
      jmp       %RET,   %0

Important here are the two lines with the comments function prologue and function epilogue. Every function has to begin with the prologue movq %RET, (%SP) and has to end with the epilogue movq (%SP), %RET followed by the jump to the return address.

Applying the calling convention to the working example

Usually we will blend out the details of the calling conventions when we describe a program. But for now, let's blend in the convention for the working example from above. Just to get a feeling what needs to be done in a mechanical, boring way that actually something like a compiler could do for us:

How does our calling convention work?

We could describe the convention as follows:

  • Pushing the return address is not an atomic operation. The first half of the operation is done before the jump and the second half by the function prologue.

  • Popping the return address is not an atomic operation. The first half of the operation is done by the function epilogue and the second half by the caller after the function returned.

Before this “mechanism for calling a function” gets triggered the stack can be described by:

The concrete address of the stack does not matter. What matters is that at higher addresses (memory cells at the right-hand side of %SP) are used, i.e. contain data that is saved on the stack.

Caller does “first half of pushing the return address”

The relevant part done by the caller for pushing the return address is done by

1
    subq    8,      %SP,    %SP

This provides the callee space on the stack to save its return address. Now the stack can be described by

The next two instructions have nothing to do with the stack. The caller just loads the functions address and jumps to the function:

1
2
    ldzwq   foo,    %CALL    # you can use any register for CALL except SP
    jmp     %CALL,  %RET

But after the jump the return address is stored in %RET and the callee takes over.

Callee prologue does “second half of pushing the return address”

As the callee knows that the return address is in register %RET, it can complete the push operation with

1
    movq    %RET,   (%SP)

So here we go, the return address is at as safe place, at the right-hand side of the stack pointer:

Now the callee can do something useful

After the prologue the callee can start to do it's job. Even calling a function!

Callee epilogue does “first half of popping the return address”

When the callee is done with its job the epilogue

1
    movq    (%SP),  %RET

does the first half of the pop operation, i.e. copying the return address into %RET.

Then the callee can return with

1
    jmp     %RET,   %0

Note that neither the prologue nor epilogue did modify the stack pointer. Both were just storing or fetching data from the top of the stack.

Caller does “seconf half of popping the return address”

After the function returned the caller restores the stack with

1
    subq    8,      %SP,    %SP

so we have the same stack as before (with trash on the left-hand-side of %SP):

Example Code with the bloody details

This code implements the working example according to the above pattern:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
        #NOTE: relevant for the function call convention
        .equ    SP,     2
        .equ    RET,    3

        .text
/*
        Entry point
*/
_start:
        #NOTE: relevant for the convention
        ldzwq   0,              %SP

        subq    8,      %SP,    %SP
        ldzwq   main,           %4
        jmp     %4,             %RET
        addq    8,      %SP,    %SP

/*
        (only) Exit point
*/
        halt    0

//------------------------------------------------------------------------------
// PROGRAM main
//------------------------------------------------------------------------------

        .text
main:
        // save what needs to be restored
        movq    %RET,           (%SP)

        // begin of the function body
        putc    'A'

        //      function call: funcA();
        subq    8,      %SP,    %SP
        ldzwq   funcA,  %4
        jmp     %4,     %RET
        addq    8,      %SP,    %SP

        putc    'B'
        putc    '\n'
        // end of the function body

        // restore what was saved and return
        movq    (%SP),  %RET
        jmp     %RET,   %0

//------------------------------------------------------------------------------
// SUBPROGRAMS 
//------------------------------------------------------------------------------

/*
        void
        funcA(void);
*/
        .text
funcA:
        // save what needs to be restored
        movq    %RET,   (%SP)

        // begin of the function body
        
        putc    'C'

        // end of the function body

        // restore what was saved and return
        movq    (%SP),  %RET
        jmp     %RET,   %0

Translate and run it as usual:

theon$ ulmas -o subprog_without_fp  subprog_without_fp.s
theon$ ulm subprog_without_fp
ACB
theon$ 

Notes about using symbols for registers

Note that symbols are used in the code only for those registers that are relevant for the calling convention, i.e. for %SP and %RET. For other registers plain literals are used, e.g. %4. Our new function call convention will allow it that registers %4, ..., %255 can be used freely in every function. They no longer have a special meaning. So in one function you can use %4 for calling a function and in another %5 etc.

Btw: What does “... without fp” mean?

“fp” is short for frame pointer. So “subprogram without frame ppointer” indicates that some extension is coming up ;-)

Quiz08

For this quiz you have to hand in two files: call_stack.s and call_stack.txt.

On theon submit your program with the command

submit hpc quiz08 call_stack.s call_stack.txt

Your answers will not be evaluated automatically but you should get a confirmation email after your submit.

About call_stack.s

Write a program call_stack.s with functions main, funcA, funcB and funcC. These functions are supposed to do the following:

  • Function main prints the character M, calls function funcA, prints m, prints a newline and then returns.

  • Function funcA prints the character A, calls function funcB, prints ,, calls function funcC, prints a and then returns.

  • Function funcB prints the character B and then returns.

  • Function funcC prints the character C, calls function funcB, prints c, calls function funcB and then returns.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
        #NOTE: relevant for the function call convention
        .equ    SP,     2
        .equ    RET,    3

        .text
/*
        Entry point
*/
_start:
        #NOTE: relevant for the convention
        ldzwq   0,              %SP

        subq    8,      %SP,    %SP
        ldzwq   main,           %4
        jmp     %4,             %RET
        addq    8,      %SP,    %SP

/*
        (only) Exit point
*/
        halt    0

//------------------------------------------------------------------------------
// PROGRAM main
//------------------------------------------------------------------------------

About call_stack.txt

It is important that your program “blindly follows the rules”. You are supposed to work here like a compiler! But if you have an idea how the implementation of some of these functions could be simplified then write down your thought in call_stack.txt. Maybe there is even a simple rule in which case your idea always works.