Calling convention, frame pointer and local variables

The calling convention for subprograms (this includes procedures and functions) allows a subprogram to freely use the registers %4, ..., %255. So no more “perfect roomer”: when you call a function you have to expect that these registers were modified.

Hence, if values stored in registers are still needed after a function call, you have to save them before the call and restore them after the call, and you have to use memory for that. Apart from storing the return address on the stack, memory was so far only used for global variables (stored in either the data segment or the BSS segment). However, it would not be feasible to use global variables for saving registers. Each global variable requires a unique label, so as soon as the number of subprograms grows you would end up with name conflicts and unmanageable pieces of software. The rule of thumb is to use global variables only when you have a good reason for it. For example, we will use global variables to communicate with subprograms until we have functions that can receive arguments and return a value.

This problem can be avoided by using the stack to store variables. Such variables are called local variables. When a function needs local variables, it reserves sufficient space on the stack by decrementing the stack pointer, and it releases this memory before the return. This gets done in the prologue and epilogue of the function. The advantage is that the memory region used for local variables is bound to the life span of a function call. After a function has done its job (i.e. the function returned) the memory can be reused.

To give you an idea of how the concepts of global and local variables are expressed (and the technical details hidden) in C, the following code fragment has a global variable global and a subprogram foo with a local variable local:

int64_t global;

void
foo(void)
{
    int64_t local;

    // implementation of foo
}

Using C code as pseudo code allows us to show how subprograms can be used for actually doing something useful. For example, the following subprogram factorial computes the factorial of an unsigned integer recursively:

int64_t arg;

void
factorial(void)
{
    int64_t n;

    n = arg;
    if (n==0) {
        arg = 1;
    } else {
        arg = arg - 1;
        factorial();
        arg = n * arg;
    }
}

In this code a global variable arg is used to pass an argument to the subprogram and to receive the result from the subprogram.
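The role of the local variable n can be made visible with a small experiment. The following sketch contains the factorial subprogram from above and a deliberately broken variant in which n is a global variable. The names n_global, factorial_broken, call_factorial and call_factorial_broken are made up for this illustration and are not part of the text above.

```c
#include <assert.h>
#include <stdint.h>

static int64_t arg;         /* global for passing the argument and the result */
static int64_t n_global;    /* hypothetical: n as a global instead of a local */

/* Version from the text: n is local, so every call has its own copy. */
static void factorial(void)
{
    int64_t n;

    n = arg;
    if (n == 0) {
        arg = 1;
    } else {
        arg = arg - 1;
        factorial();
        arg = n * arg;
    }
}

/* Broken variant: all calls share n_global, the recursive call overwrites it. */
static void factorial_broken(void)
{
    n_global = arg;
    if (n_global == 0) {
        arg = 1;
    } else {
        arg = arg - 1;
        factorial_broken();
        arg = n_global * arg;   /* n_global is 0 here, set by the innermost call */
    }
}

/* Hypothetical helpers that hide the protocol with the global variable arg. */
int64_t call_factorial(int64_t value)
{
    assert(value >= 0);         /* negative values would recurse forever */
    arg = value;
    factorial();
    return arg;
}

int64_t call_factorial_broken(int64_t value)
{
    assert(value >= 0);
    arg = value;
    factorial_broken();
    return arg;
}
```

With the local variable, call_factorial(5) gives 120. In the broken variant the innermost call leaves 0 in n_global, so every multiplication after the recursion multiplies by 0 and call_factorial_broken(5) gives 0.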

Basic idea (without the gory details) for local variables on the stack

Again we first leave out some of the gory details. Let's assume that the caller has already pushed the return address on the stack. So when the function gets called, the return address is at the top of the stack.

Further assume that the function has two local variables a and b (both with the size of a quad word). Then the function decrements the stack pointer by 16 in its prologue, which reserves 16 bytes at the top of the stack for a and b.

When this function calls another function these local variables are protected, because they are on the stack. In the epilogue of the function the stack pointer gets incremented by 16 again, and hence the memory for these variables can be reused afterwards.

Again it is worth mentioning that removing elements from our stack does not “clean up” memory in the sense of zeroing out bytes. We just move the stack pointer, which indicates that the memory region is now free to use.
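The idea can be played through with a toy model in C. This is only a sketch under simplifying assumptions: the memory is a small byte array, the stack pointer is an index that starts at the end of this array, and all names (memory, sp, reserve_locals and so on) are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { MEM_SIZE = 64 };             /* hypothetical size of the toy memory */

static uint8_t  memory[MEM_SIZE];   /* toy memory, the stack grows towards index 0 */
static uint64_t sp = MEM_SIZE;      /* toy stack pointer: index of the top of the stack */

/* Prologue step: reserve bytes at the top of the stack. */
void reserve_locals(uint64_t bytes)
{
    assert(bytes <= sp);
    sp -= bytes;
}

/* Epilogue step: release the bytes again. The memory content is not touched. */
void release_locals(uint64_t bytes)
{
    assert(sp + bytes <= MEM_SIZE);
    sp += bytes;
}

/* Store a quad word at offset off relative to the stack pointer. */
void store_quad(uint64_t off, int64_t value)
{
    assert(sp + off + sizeof value <= MEM_SIZE);
    memcpy(&memory[sp + off], &value, sizeof value);
}

/* Fetch a quad word at offset off relative to the stack pointer. */
int64_t fetch_quad(uint64_t off)
{
    int64_t value;
    assert(sp + off + sizeof value <= MEM_SIZE);
    memcpy(&value, &memory[sp + off], sizeof value);
    return value;
}

/* Peek at released memory below the stack pointer (only for the demonstration). */
int64_t fetch_quad_below(uint64_t bytes_below)
{
    int64_t value;
    assert(bytes_below >= sizeof value && bytes_below <= sp);
    memcpy(&value, &memory[sp - bytes_below], sizeof value);
    return value;
}

uint64_t stack_pointer(void)
{
    return sp;
}
```

After reserve_locals(16), the two quad words a and b live at offsets 0 and 8 relative to the stack pointer. After release_locals(16) the stack pointer has its old value, but fetch_quad_below still finds the old values of a and b in memory, because releasing only moves the stack pointer.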

Why we want a frame pointer

Space for local variables is reserved in the function's prologue and released in its epilogue. Both have to match, i.e. when you reserve 16 bytes you have to release 16 bytes. It is possible to program that correctly, but assume you change the implementation of the function because you want another local variable. In such a case you have to change both the prologue and the epilogue, and the need to change two things that have to match can be an annoying source of careless errors.

Ideally you could always use the same prologue and the same epilogue for all functions. Then you could reuse both by copy and paste (or hide them behind a macro that gets expanded by a preprocessor before the assembler sees the code). The next best thing is that only the prologue needs to be adapted for each function while the epilogue is always the same. This can be achieved by using a frame pointer. For that, another register %FP will be reserved in the calling convention. When a function gets called the original stack pointer becomes the frame pointer, and then the stack pointer gets decremented if local variables are needed. So during the function call the memory region used for local variables is “framed” by the stack pointer %SP and the frame pointer %FP.

The prologue and epilogue get adapted such that after a function returns the stack pointer and the frame pointer are as before, so for the caller it seems that nothing changed in that respect. This is achieved as follows:

  • The prologue consists of 3 instructions if no local variables are needed, and otherwise 4 instructions:

    • save the return address %RET on the stack (as before),

    • save the original frame pointer %FP on the stack,

    • the frame pointer %FP saves the original stack pointer and

    • if local variables are needed decrement stack pointer %SP.

  • The epilogue before the return instruction always consists of 3 instructions (independent of how many local variables are used):

    • restore the original stack pointer,

    • restore the original frame pointer and

    • restore the return address %RET.

Now the gory details are about where on the stack the return address and the frame pointer are stored. Our protocol will specify that the caller has to reserve space on the stack so that the callee can save two registers, the return register and the frame pointer register.

Calling convention: The gory details

Again I will first write down the details of the calling convention and then show with an example how things work out.

Reserved registers

Three registers are used for the calling convention, and there won't be any further changes to support procedures and functions:

    .equ    FP,     1
    .equ    SP,     2
    .equ    RET,    3

Register %FP is used for the frame pointer, register %SP for the stack pointer, and register %RET for the return address.

Calling a function

The essential pattern for calling a subprogram is this:

    subq    16,     %SP,    %SP             # provide space on stack for callee
    /*
        Load the address of the function in a register %CALL.
    */
    jmp     %CALL,  %RET
    addq    16,     %SP,    %SP             # restore old stack state

This means the caller reserves 16 bytes on the stack for the callee, which uses them for storing the return address (as before) and the frame pointer (that is new). The callee stores the return address at 0(%SP) and the frame pointer at 8(%SP).

For the more general case of procedures and functions this will be adapted (the stack will also be used to pass arguments and for receiving results).

Implementing a function: Prologue and Epilogue

Every function has the following structure:

function_name:
    /*
        Function prologue
    */

    /*
        Implementation of the function
    */

    /*
        Function epilogue
    */
    jmp %RET,   %0

This structure will be general enough to also support procedures and functions.

The 16 bytes reserved by the caller at the top of the stack, i.e. at address %SP, are used in the prologue and epilogue for saving and restoring the return address and the frame pointer. The return address is stored in the first 8 bytes, at 0(%SP), and the frame pointer in the next 8 bytes, at 8(%SP).

Prologue

As described above, when a function gets called the return address and the original frame pointer get saved first. Then the frame pointer marks the original top of the stack and, if needed, the stack pointer gets decremented for local variables:

        movq %RET,  (%SP)           // save the return address 
        movq %FP,   8(%SP)          // save original frame pointer
        addq 0,     %SP,    %FP     // frame pointer is original stack pointer 
        /*
            One more instruction here if local variables are needed:
            - decrement %SP for local variables
            - note that %SP needs to be aligned to 8 bytes!
            So decrement %SP by the needed size rounded up to the next multiple
            of 8.
        */

At the moment you can ignore the details about the alignment requirement of the stack pointer. In all examples the stack pointer will be decremented by a multiple of 8. As the stack pointer is initialized with zero, it will therefore always be aligned to 8 bytes.

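After the prologue (and for the actual function implementation) the stack can be described as follows: the saved return address is at 0(%FP), the saved frame pointer of the caller is at 8(%FP), and the local variables occupy the bytes from %SP up to (but not including) %FP. Local variables are therefore addressed with negative offsets relative to %FP, e.g. -8(%FP).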

The comment “note that %SP needs to ...” and what follows it are for future reference: in general, the number of bytes needed for local data has to be rounded up to the next multiple of 8, and the stack pointer has to be decremented by this amount. Otherwise a called function cannot use the stack for data that needs alignment.
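The rounding itself is a small piece of arithmetic. As a sketch in C (the function name round_up_to_8 is made up for this illustration), adding 7 and then clearing the lowest three bits gives the next multiple of 8, and leaves a value unchanged if it already is a multiple of 8:

```c
#include <assert.h>
#include <stdint.h>

/* Round bytes up to the next multiple of 8 (multiples of 8 are kept as they are). */
uint64_t round_up_to_8(uint64_t bytes)
{
    assert(bytes <= UINT64_MAX - 7);    /* the addition below must not wrap around */
    return (bytes + 7) & ~(uint64_t)7;
}
```

For example, a function that needs 12 bytes for local data would decrement %SP by round_up_to_8(12), which is 16.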

Epilogue

The epilogue guarantees that after the return the caller has the same stack as before, i.e. it restores the original stack and frame pointer. The last instruction loads the return address into the return register.

        addq 0,     %FP,    %SP     // restore original stack pointer
        movq 8(%SP),%FP             // restore original frame pointer
        movq 0(%SP),%RET            // restore the return address
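To see that prologue and epilogue fit together for any size of local data, the convention can be modelled in C. This is only a sketch of the protocol, not of the ULM itself: the three registers are plain variables, the memory is a small array of quad words, the stack pointer starts at the end of this toy memory instead of at zero, and all names (mem, call_function, reg_sp and so on) are assumptions made for this illustration.

```c
#include <assert.h>
#include <stdint.h>

enum { WORDS = 32 };                /* hypothetical size of the toy memory in quad words */

static uint64_t mem[WORDS];         /* toy memory, addressed in bytes through load/store */
static uint64_t fp  = 0;            /* models %FP */
static uint64_t sp  = WORDS * 8;    /* models %SP, starts at the end of the toy memory */
static uint64_t ret = 0;            /* models %RET */

static void store(uint64_t addr, uint64_t value)
{
    assert(addr % 8 == 0 && addr / 8 < WORDS);
    mem[addr / 8] = value;
}

static uint64_t load(uint64_t addr)
{
    assert(addr % 8 == 0 && addr / 8 < WORDS);
    return mem[addr / 8];
}

/* Prologue: the only part that depends on the size of the local data. */
static void prologue(uint64_t locals)
{
    store(sp, ret);         /* movq %RET, (%SP)          */
    store(sp + 8, fp);      /* movq %FP,  8(%SP)         */
    fp = sp;                /* addq 0,    %SP,  %FP      */
    sp -= locals;           /* subq locals, %SP, %SP     */
}

/* Epilogue: the same three instructions for every function. */
static void epilogue(void)
{
    sp  = fp;               /* addq 0,      %FP, %SP     */
    fp  = load(sp + 8);     /* movq 8(%SP), %FP          */
    ret = load(sp);         /* movq 0(%SP), %RET         */
}

/* Caller side plus a dummy callee that uses locals bytes of local data. */
void call_function(uint64_t return_address, uint64_t locals)
{
    assert(locals % 8 == 0 && locals + 16 <= sp);
    sp -= 16;               /* subq 16, %SP, %SP: space for the callee */
    ret = return_address;   /* what jmp %CALL, %RET would store in %RET */
    prologue(locals);
    if (locals >= 8) {
        store(fp - 8, 123); /* use the first local variable, -8(%FP) */
    }
    ret = 0;                /* pretend a nested call has overwritten %RET */
    epilogue();
    sp += 16;               /* addq 16, %SP, %SP: restore old stack state */
}

uint64_t reg_sp(void)  { return sp; }
uint64_t reg_fp(void)  { return fp; }
uint64_t reg_ret(void) { return ret; }
```

Whatever value is passed for locals, the stack pointer and the frame pointer have their old values after call_function returns, and the return register again holds the address it had when the callee was entered. Only prologue takes the size of the local data as a parameter; epilogue does not need it.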

A working example

  • As before, the code block with label _start gets executed first when the program gets started. Here the stack gets initialized and then the main subprogram gets called. After main returns the program is halted.

  • Subprogram main saves a register in a local variable and calls function funcA. After the function has returned, the register gets restored.

  • Subprogram funcA just modifies a register and returns.

theon$ ulmas -o subprog_with_fp subprog_with_fp.s
theon$ ulm subprog_with_fp
AM
theon$ 
        #NOTE: relevant for the function call convention
        .equ    FP,     1
        .equ    SP,     2
        .equ    RET,    3

        .text
/*
        Entry point
*/
_start:
        #NOTE: relevant for the convention
        ldzwq   0,              %SP

        subq    16,             %SP,            %SP
        ldzwq   main,           %4
        jmp     %4,             %RET
        addq    16,             %SP,            %SP

/*
        (only) Exit point
*/
        halt    0

//------------------------------------------------------------------------------
// PROGRAM main
//------------------------------------------------------------------------------

        .text
main:
        // save what needs to be restored and make space for local variables
        movq    %RET,           0(%SP)
        movq    %FP,            8(%SP)
        addq    0,              %SP,            %FP
        subq    8,              %SP,            %SP     # 8 bytes for locals 

        // begin of the function body
        ldzwq   'M',            %5
        movq    %5,             -8(%FP)  // save 'M' in local variable

        subq    16,             %SP,            %SP     // function call: funcA()
        ldzwq   funcA,          %4
        jmp     %4,             %RET
        addq    16,             %SP,            %SP

        movq    -8(%FP),        %5      // restore 'M' from local variable
        putc    %5
        putc    '\n'                    // for convenience
        // end of the function body

        // restore stack and what was saved and return
        addq    0,              %FP,            %SP
        movq    8(%SP),         %FP
        movq    0(%SP),         %RET
        jmp     %RET,           %0

//------------------------------------------------------------------------------
// SUBPROGRAMS 
//------------------------------------------------------------------------------

/*
        void
        funcA(void);
*/
        .text
funcA:
        // save what needs to be restored
        movq    %RET,           0(%SP)
        movq    %FP,            8(%SP)
        addq    0,              %SP,            %FP

        // begin of the function body
        
        ldzwq   '!',            %5      // overwrite %5, callees may modify %4, ..., %255
        putc    'A'

        // end of the function body

        // restore what was saved and return
        addq    0,              %FP,            %SP
        movq    8(%SP),         %FP
        movq    0(%SP),         %RET
        jmp     %RET,           %0