=====================================================
Calling convention, frame pointer and local variables [TOC]
=====================================================

The calling convention for subprograms (so this will include procedures and
functions) will allow that a subprogram can freely use registers `%4`, ...,
`%255`. So no more "perfect roomer": when you call a function you have to
expect that these registers were modified. Hence, if values stored in registers
are still needed after a function call you have to _save_ them before the call
and _restore_ them after the call, and you have to use the memory for that.

Except for storing the return address on the stack, so far memory was only used
for global variables (stored in either the data segment or the BSS segment).
However, it would not be feasible to use global variables for saving registers.
Each global variable requires a unique label, so as soon as the number of
subprograms grows you would end up with name conflicts and unmanageable pieces
of software. The rule of thumb is to use global variables only when you have a
good reason for it. We will, for example, use global variables to communicate
with subprograms until we have functions that can receive arguments and return
a value.

This problem can be avoided by using the stack to store variables. These
variables are then called _local variables_. When a function needs local
variables it reserves sufficient space on the stack by decrementing the stack
pointer and releases the memory before the return. This gets done in the
prologue and epilogue of the function. The advantage of this is that the memory
region used for local variables is bound to the life span of a function call.
After a function has done its job (i.e. the function returned) the memory can
be reused.

To give you an idea of how the concepts of global and local variables are
expressed (and the technical details hidden) in C, the following code fragment
has a global variable `global` and a subprogram `foo` with a local variable
`local`:

---- CODE (type=c) -------------------------------------------------------------
#include <stdint.h>

int64_t global;

void
foo(void)
{
    int64_t local;
    // implementation of foo
}
--------------------------------------------------------------------------------

Using C code as pseudo code makes it possible to show how subprograms can be
used for actually doing something useful. For example, the following subprogram
`factorial` can be used to compute the factorial of an unsigned integer
recursively:

---- CODE (type=c) -------------------------------------------------------------
#include <stdint.h>

int64_t arg;

void
factorial(void)
{
    int64_t n;

    n = arg;                // keep the argument in a local variable
    if (n == 0) {
        arg = 1;
    } else {
        arg = arg - 1;
        factorial();        // overwrites arg with the factorial of n - 1
        arg = n * arg;
    }
}
--------------------------------------------------------------------------------

In this code a global variable `arg` is used to pass an argument to the
subprogram and to receive the result from the subprogram.

Basic idea (without the gory details) for local variables on the stack
======================================================================

Again we first leave out some of the gory details. Let's assume that a caller
already pushed the return address on the stack. So when the function gets
called the stack looks like this:

---- TIKZ ----------------------------------------------------------------------
\begin{tikzpicture}
    \input{memory.tex}
    \renewcommand\MemCellWidth { 0.48}
    \DrawMemArrayOpen{-48}{-1}
    \DrawMemVariable[red!40]{-24}{0}{Used}
    \DrawMemVariable[white]{-48}{-24}{Not used}
    \DrawPointer{-24}{\%SP}
\end{tikzpicture}
--------------------------------------------------------------------------------

Further assume that the function has two local variables `a` and `b` (both with
a size of a quad word).
Then the function decrements the stack pointer by 16 in its prologue, for using
16 bytes at the top of the stack:

---- TIKZ ----------------------------------------------------------------------
\begin{tikzpicture}
    \input{memory.tex}
    \renewcommand\MemCellWidth { 0.48}
    \DrawMemArrayOpen{-48}{-1}
    \DrawMemVariable[red!40]{-24}{0}{Used}
    \DrawMemVariable[gray!40]{-32}{-24}{Local variable a}
    \DrawMemVariable[gray!40]{-40}{-32}{Local variable b}
    \DrawMemVariable[white]{-48}{-40}{Not used}
    %\DrawPointer{-24}{\%FP}
    \DrawPointer{-40}{\%SP}
\end{tikzpicture}
--------------------------------------------------------------------------------

When this function calls another function these local variables are protected,
because they are on the stack. In the epilogue of the function the stack
pointer gets incremented by 16, and hence the memory for these variables can be
reused afterwards. So after the return the stack looks like this:

---- TIKZ ----------------------------------------------------------------------
\begin{tikzpicture}
    \input{memory.tex}
    \renewcommand\MemCellWidth { 0.48}
    \DrawMemArrayOpen{-48}{-1}
    \DrawMemVariable[red!40]{-24}{0}{Used}
    \DrawMemVariable[gray!40]{-32}{-24}{Not used}
    \DrawMemVariable[gray!40]{-40}{-32}{Not used}
    \DrawMemVariable[white]{-48}{-40}{Not used}
    %\DrawPointer{-24}{\%FP}
    \DrawPointer{-24}{\%SP}
\end{tikzpicture}
--------------------------------------------------------------------------------

Again it is worth mentioning that removing elements from our stack does not
"clean up" memory in the sense of zeroing out bytes. We just move the stack
pointer, which indicates that the memory region is now free to use.

Why we want a frame pointer
===========================

Reserving space for local variables is done in the function's prologue, and
releasing it in the epilogue. Both have to match, i.e. when you reserve 16
bytes you have to release 16 bytes.
It is possible to program that correctly, but assume you change the
implementation of the function because you want another local variable. In such
a case you have to change both the prologue and the epilogue, and the need to
change two things that have to match can be an annoying source of careless
errors.

Ideally you can always use the same prologue and the same epilogue for all
functions. Then you can use both by copy and paste (or hide them behind some
macro that gets expanded by a preprocessor before the assembler sees the code).
The next best thing is that only the prologue needs to be adapted for each
function while the epilogue is always the same. This can be achieved by using a
_frame pointer_.

For that, another register `%FP` will be reserved in the calling convention.
When a function gets called the original stack pointer becomes the frame
pointer, and then the stack pointer gets decremented if local variables are
needed. So during the function call the memory region used for local variables
is "framed" by the stack pointer `%SP` and the frame pointer `%FP`:

---- TIKZ ----------------------------------------------------------------------
\begin{tikzpicture}
    \input{memory.tex}
    \renewcommand\MemCellWidth { 0.48}
    \DrawMemArrayOpen{-48}{-1}
    \DrawMemVariable[red!40]{-24}{0}{Used}
    \DrawMemVariable[gray!40]{-32}{-24}{local variable a}
    \DrawMemVariable[gray!40]{-40}{-32}{local variable b}
    \DrawMemVariable[white]{-48}{-40}{Not used}
    \DrawPointer{-24}{\%FP}
    \DrawPointer{-40}{\%SP}
\end{tikzpicture}
--------------------------------------------------------------------------------

The prologue and epilogue get adapted such that after a function returns the
stack and frame pointer are as before, so for the caller it seems that nothing
changed in that respect.
This is achieved as follows:

- The prologue consists of 3 instructions if no local variables are needed,
  and otherwise 4 instructions:

    - save the return address `%RET` on the stack (as before),
    - save the original frame pointer `%FP` on the stack,
    - the frame pointer `%FP` saves the original stack pointer and
    - if local variables are needed decrement the stack pointer `%SP`.

- The epilogue before the return instruction always consists of 3 instructions
  (independent of how many local variables are used):

    - restore the original stack pointer,
    - restore the original frame pointer and
    - reload the return address into `%RET`.

Now the gory details are about where on the stack the return address and the
frame pointer are stored. Our protocol will specify that the caller has to
reserve space on the stack so that the callee can save two registers, the
return register and the frame pointer register.

Calling convention: The gory details
====================================

Again I will first write down the details of the calling convention and then
show with an example how things work out.

Reserved registers
------------------

Three registers are used for the calling convention, and there won't be any
further changes to support procedures and functions:

---- CODE (type=s) -------------------------------------------------------------
        .equ    FP,     1
        .equ    SP,     2
        .equ    RET,    3
--------------------------------------------------------------------------------

Register `%FP` is used for the frame pointer, register `%SP` for the stack
pointer, and register `%RET` for the return address.

Calling a function
------------------

The essential pattern for calling a subprogram is this:

---- CODE (type=s) -------------------------------------------------------------
        subq    16,     %SP,    %SP     # provide space on stack for callee
        /*
            Load the address of the function in a register %CALL.
        */
        jmp     %CALL,  %RET
        addq    16,     %SP,    %SP     # restore old stack state
--------------------------------------------------------------------------------

This means the caller reserves 16 bytes for the callee on the stack for storing
the return address (as before) and the frame pointer (that is new). The format
of the provided space is

---- TIKZ ----------------------------------------------------------------------
\begin{tikzpicture}
    \input{memory.tex}
    \renewcommand\MemCellWidth { 0.48}
    \DrawMemArrayOpen{-48}{-1}
    \DrawMemVariable[red!40]{-8}{0}{Used}
    \DrawQuadVariable[cyan!40]{-24}{reserved for callee}
    \DrawQuadVariable[cyan!40]{-16}{reserved for callee}
    \DrawMemVariable[gray!20]{-48}{-24}{Can be used for local variables}
    \DrawPointer{-24}{\%SP}
\end{tikzpicture}
--------------------------------------------------------------------------------

For the more general case of procedures and functions this will be adapted (the
stack will also be used to pass arguments and for receiving results).

Implementing a function: Prologue and Epilogue
----------------------------------------------

Every function has the following structure

---- CODE (type=s) -------------------------------------------------------------
function_name:
        /* Function prologue */

        /* Implementation of the function */

        /* Function epilogue */
        jmp     %RET,   %0
--------------------------------------------------------------------------------

This will be general enough for also supporting procedures and functions. The
16 bytes reserved at the top of the stack, i.e. at address `%SP`, are used in
the _prologue_ and _epilogue_ for saving the return address and the frame
pointer.
The format for these 16 bytes can be described by

---- TIKZ ----------------------------------------------------------------------
\begin{tikzpicture}
    \input{memory.tex}
    \renewcommand\MemCellWidth { 0.48}
    \DrawMemArrayOpen{-48}{-1}
    \DrawMemVariable[red!40]{-8}{0}{Used}
    \DrawQuadVariable[cyan!40]{-24}{reserved for \%RET}
    \DrawQuadVariable[cyan!40]{-16}{reserved for \%FP}
    \DrawMemVariable[gray!20]{-40}{-24}{Will be used for local variables}
    \DrawPointer{-24}{\%SP}
\end{tikzpicture}
--------------------------------------------------------------------------------

Prologue
~~~~~~~~

As described above, when a function gets called the return address and the
original frame pointer get saved first. Then the frame pointer marks the
original top of the stack and, if needed, the stack pointer gets decremented
for local variables:

---- CODE(type=s) --------------------------------------------------------------
        movq    %RET,   (%SP)       // save the return address
        movq    %FP,    8(%SP)      // save original frame pointer
        addq    0,      %SP,    %FP // frame pointer is original stack pointer
        /*
            One more instruction here if local variables are needed:
             - decrement %SP for local variables
             - note that %SP needs to be aligned to 8 bytes! So decrement
               %SP by the needed size rounded up to the next multiple of 8.
        */
--------------------------------------------------------------------------------

At the moment you can ignore the details about the alignment requirement of the
stack pointer. In all examples the stack pointer will be decremented by a
multiple of 8. As the stack pointer of the empty stack is initialized with
zero, it will therefore always be aligned to 8 bytes.

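
The rounding mentioned in the comment of the prologue can be sketched in C as
follows. This is only an illustration: the function name `round_up_to_8` is
made up for this sketch and is not part of the code of this session.

```c
#include <stdint.h>

/*
 * Hypothetical helper (assumption, not from the session's code): round the
 * number of bytes needed for local variables up to the next multiple of 8.
 */
uint64_t
round_up_to_8(uint64_t size)
{
    /*
     * Adding 7 and then clearing the lowest three bits gives the smallest
     * multiple of 8 that is greater than or equal to size.
     */
    return (size + 7) & ~(uint64_t)7;
}
```

For example, a function that needs 13 bytes for local data would decrement
`%SP` by `round_up_to_8(13)`, i.e. by 16, while a function that needs exactly
16 bytes would also decrement it by 16.
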
After the prologue (and during the actual function implementation) the stack
can be described by

---- TIKZ ----------------------------------------------------------------------
\begin{tikzpicture}
    \input{memory.tex}
    \renewcommand\MemCellWidth { 0.48}
    \DrawMemArrayOpen{-48}{-1}
    \DrawMemVariable[red!40]{-8}{0}{Used}
    \DrawQuadVariable[cyan!40]{-24}{saved \%RET}
    \DrawQuadVariable[cyan!40]{-16}{saved \%FP}
    \DrawMemVariable[gray!40]{-40}{-24}{Used for local variables}
    \DrawPointer{-40}{\%SP}
    \DrawPointer{-24}{\%FP}
\end{tikzpicture}
--------------------------------------------------------------------------------

The comment "note that `%SP` needs to ..." and what follows it are for future
reference: In general the number of bytes needed for local data needs to be
rounded up to the next multiple of 8, and the stack pointer needs to be
decremented by this amount. Otherwise a called function cannot use the stack
for data that needs alignment.

Epilogue
~~~~~~~~

The epilogue guarantees that after the return the caller has the same stack as
before, i.e. it restores the original stack and frame pointer. The last
instruction loads the return address into the return register.

---- CODE(type=s) --------------------------------------------------------------
        addq    0,      %FP,    %SP // restore original stack pointer
        movq    8(%SP), %FP         // restore original frame pointer
        movq    0(%SP), %RET        // reload the return address
--------------------------------------------------------------------------------

Some working example
====================

- As before the code block with label `_start` gets executed first when the
  program gets started. Here the stack gets initialized and then the `main`
  subprogram gets called. After `main` returns the program is halted.
- Subprogram `main` saves a register in a local variable and calls function
  `funcA`. After the function has returned the register gets restored.
- Subprogram `funcA` just modifies some register and returns.

---- SHELL (path=session10/subprog) --------------------------------------------
ulmas -o subprog_with_fp subprog_with_fp.s
ulm subprog_with_fp
--------------------------------------------------------------------------------

:import: session10/subprog/subprog_with_fp.s
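
If you want to play with the protocol without the assembler, the calling
convention can also be modelled in C. The following is only a rough sketch
under several assumptions: all names (`mem`, `reg_fp`, `reg_sp`, `reg_ret`,
`reg_4`, `sim_main`, `sim_funcA`, ...) are made up, the empty stack starts at
the end of a small simulated memory instead of at address zero, the return
addresses `0x1000` and `0x2000` are arbitrary numbers, and the local variable
of `sim_main` is assumed to live at `-8(%FP)`. It is not a translation of
`subprog_with_fp.s`.

```c
#include <stdint.h>
#include <string.h>

/* All names below are hypothetical and only model the convention in C. */
enum { MEM_SIZE = 256 };            /* size of the simulated memory */

uint8_t  mem[MEM_SIZE];             /* simulated memory, stack at its end */
uint64_t reg_fp, reg_sp, reg_ret;   /* model %FP, %SP and %RET */
uint64_t reg_4;                     /* models a register a callee may modify */

void store_quad(uint64_t addr, uint64_t value)
{
    memcpy(&mem[addr], &value, sizeof value);
}

uint64_t load_quad(uint64_t addr)
{
    uint64_t value;
    memcpy(&value, &mem[addr], sizeof value);
    return value;
}

/* Prologue: save %RET and %FP, set %FP, reserve space for local variables. */
void prologue(uint64_t local_size)
{
    store_quad(reg_sp, reg_ret);        /* movq %RET, (%SP)  */
    store_quad(reg_sp + 8, reg_fp);     /* movq %FP, 8(%SP)  */
    reg_fp = reg_sp;                    /* addq 0, %SP, %FP  */
    reg_sp -= local_size;               /* subq ..., %SP, %SP (if needed) */
}

/* Epilogue: always the same, independent of the number of local variables. */
void epilogue(void)
{
    reg_sp = reg_fp;                    /* addq 0, %FP, %SP  */
    reg_fp = load_quad(reg_sp + 8);     /* movq 8(%SP), %FP  */
    reg_ret = load_quad(reg_sp);        /* movq 0(%SP), %RET */
}

/* Caller side: reserve 16 bytes for the callee, call it, release them. */
void call_subprogram(void (*callee)(void), uint64_t return_address)
{
    reg_sp -= 16;                       /* subq 16, %SP, %SP */
    reg_ret = return_address;           /* models the effect of jmp %CALL, %RET */
    callee();
    reg_sp += 16;                       /* addq 16, %SP, %SP */
}

void sim_funcA(void)
{
    prologue(0);                        /* no local variables */
    reg_4 = 0;                          /* just modifies a register */
    epilogue();
}

void sim_main(void)
{
    prologue(8);                        /* one local variable (a quad word) */
    reg_4 = 42;
    store_quad(reg_fp - 8, reg_4);      /* save register in local variable */
    call_subprogram(sim_funcA, 0x2000);
    reg_4 = load_quad(reg_fp - 8);      /* restore register after the call */
    epilogue();
}

/* Returns 1 if %SP, %FP and the saved register are as before the call. */
int run_demo(void)
{
    reg_sp = MEM_SIZE;                  /* empty stack */
    reg_fp = 0;
    call_subprogram(sim_main, 0x1000);
    return reg_sp == MEM_SIZE && reg_fp == 0 && reg_4 == 42;
}
```

Calling `run_demo()` returns 1: although `sim_funcA` overwrites `reg_4`, the
value saved in the local variable of `sim_main` survives the call, and after
the return both the stack pointer and the frame pointer have their original
values. Note that `epilogue()` does not need to know how many bytes were
reserved by `prologue()`, which is the reason for having a frame pointer.
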