Stack data structure
For function calls we will use a stack for storing return addresses, arguments and local variables. In general a stack is an abstract data type that provides two operations
-
push for adding an element, and
-
pop for removing the most recently added element.
What we need is a stack data structure, i.e. an implementation of this abstract concept that can be used later for function calls. On the Intel architecture instructions like popq and pushq are provided for this operations, and registers like %rsp and %rbp are dedicated for the stack data structure. Per se the ULM architecture does not provide such dedicated instructions for stack operations and neither dedicated registers.
So it is up us, the programmers, to implement these operations, agree upon which register are dedicated to the stack implementation, and we even have to decide (or freely choose) which memory region gets used for the stack.
Error handling
Error handling for the stack is a luxury that we do not provide. Our stack implementation has to be efficient and therefore it has to be as simple as possible. If something goes wrong it is either the fault of the programmer or we ran into a scenario that we considered for efficiency concern as “sufficiently unlikely” (and can not be handled if they happen after all).
Programmers fault: Popping data from an empty stack
Things will go south if more data gets removed from the stack than was added. This easily can be avoided by making sure that push and pop operations only come in pairs. That's a simple rule, hence not our fault if a programmer does not follow it.
Programmers fault: Bugs in using the push and pop operation
As mentioned above, unlike the Intel instruction set the ULM does not provide builtin push and pop instruction. But there will be a recipe that in a “don't think just copy-and-paste these lines for push and these for pop” style. That's a simple rule, hence not our fault if a programmer does not follow it.
The “sufficiently unlikely” case
When we push something onto the stack we have to find memory space that can be overwritten. We should not overwrite memory that is used by the loaded program (i.e. the text, data and BSS segment are taboo). We will use a simple that “most likely always” finds some free memory for a push and releases it after a pop. In the unlikely event of running out of memory we will have a stack overflow. And in that case our program will not crash, and that is a problem, it will continue run on with undefined behavior.
Fetch and store instructions for the ULM
The ULM instruction set provides fetch instructions for copying data from the memory into a register, and store instruction to copy data from a register to memory. What we use here is movq (%Y), %Z with Opcode 0x18 for fetching, and movq %X, (%Z) with Opcode 0x28 for storing. As you see in the description of these instructions there is a so called alignment restriction, the address has to be a multiple of eight. In this stack example this will always be the case. So you can ignore it for now, I will dedicate an extra page in this session to this topic.
Conventions for the stack
By convention register 2 is dedicated to the stack data structure and for convenience we assume an directive like
1 | .equ SP, 2
|
is used, but it is not required. The reason for choosing register 2 is arbitrary, as we have to choose something, and therefore does not require any justification. But I can give some explanation: choosing register 0 would be unwise and register 1 will have some special meaning in the next stack convention which is also used by the ULM C compiler.
Initializing the stack
The stack gets initialized by setting %SP to zero. Because the memory has \(2^{64}\) bytes and registers \(64\) bits, zero is the address that points one-past-the-last memory cell:
Meaning of the stack pointer
If %SP is zero it means the stack is empty. That's because by convention only memory cells with an address greater or equal to %SP contain data that is stored on the stack. That means “memory on the right-hand-side of %SP” is used by the stack:
If the stack pointer is not initialized then pop and push operations will have an unpredictable behaviour (for teaching purposes the ULM initializes registers with random values before it executes a program). So we have to be sure that an instruction like
1 | ldzwq 0, %SP # initialize empty stack
|
gets executed before any push or pop operation is done.
Push operation
The push operation is implemented by decrementing the stack pointer and then copying data to the stack pointer. That means the stack grows downwards after each push:
Pushing a quad word from a register %VAL onto the stack can be implemented by
1 2 3 | # push %VAL
subq 8, %SP, %SP
movq %VAL, (%SP)
|
Here %VAL can be an arbitrary register. Note that the ULM provides no instruction for copying a literal directly into memory. You first have to copy it into a register. For example
1 2 3 4 5 | # load 42
ldzwq 42, %VAL
# push %VAL
subq 8, %SP, %SP
movq %VAL, (%SP)
|
would push the value 42 (zero extended to 64 bits) onto the stack.
Pop operation
The pop operation copies data from the end of the stack pointer into a register and then increments the stack pointer:
Popping a quad word into a register %VAL can be implemented by
1 2 3 | # push %VAL
movq (%SP), %VAL
addq 8, %SP, %SP
|
It would be insane to “clean up” memory after a pop operation, e.g by zeroing out the freed memory region. Such madness would cost an additional instruction! So stick to the rules: memory on the left-hand-side of %SP is free and memory on the left-hand-side of %SP can contain trash.
Why does our stack grow downwards?
The ULM provides a vast amount of \(2^{64}\) bytes of memory. And so far we made only little use of it. We only use the memory region with small addresses (actually beginning with address ) to load the segments of the program that we run:
When using the stack it should be nearly impossible to overwrite the loaded program. We can achieve that by
-
Either starting the stack at the end of the BSS segment and letting it grow upwards (i.e. growing to higher addresses) after a push
-
or starting the stack at “end of the memory” and letting it grow downwards.
The choice for the later one is simplicity. Register %SP is used as a 64-bit address, i.e. an unsigned 64-bit integer. Therefore address \(2^{64}\) actually means \(u\left(2^{64}\right) \bmod 2^{64} = 0\) and therefore zero initializing %SP means that it always points to “the end of the memory” or one-past-the-last memory cell.
Think about the alternative: Using the address after the BSS program would mean each program would have a different start address for the stack. Certainly that could be realized equally efficient (the assembler has that information) but it would require some extra thought.
Example for using the stack to store some characters
Before using the stack for function calls you should look at a simple example to observe how using the stack works and what happens after a push and pop operation. For this purpose, the following program is supposed to be as simple as possible. It first pushes the characters 'A', 'B' and 'C' onto the stack. Then it pops off and prints these values. While being simple in structure the program has more then just a few lines of code (hey, we are programming in assembly):
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 | .equ SP, 2
.equ VAL, 68 # just for convenience
.text
/*
initialize stack
*/
ldzwq 0, %SP
/*
push 'A", push 'B', push 'C'
*/
# load value in %VAL
ldzwq 'A', %VAL
# push %VAL
subq 8, %SP, %SP
movq %VAL, (%SP)
# load value in %VAL
ldzwq 'B', %VAL
# push %VAL
subq 8, %SP, %SP
movq %VAL, (%SP)
# load value in %VAL
ldzwq 'C', %VAL
# push %VAL
subq 8, %SP, %SP
movq %VAL, (%SP)
/*
pop and print value (3 times)
*/
# pop %VAL
movq (%SP), %VAL
addq 8, %SP, %SP
# use popped value
putc %VAL
# pop %VAL
movq (%SP), %VAL
addq 8, %SP, %SP
# use popped value
putc %VAL
# pop %VAL
movq (%SP), %VAL
addq 8, %SP, %SP
# use popped value
putc %VAL
/*
TODO: Your bug here
*/
putc '\n'
halt 0
|
Running the program therefore produces the following output:
theon$ ulmas -o stack stack.s theon$ ulm stack CBA theon$
But of course the intension of providing this example is to understand the program line by line. In the following I will visualize how the stack gets affected and you should confirm that by running the program with the ULM debugger ulm-qui. It is in particular to “see the meaning” of what the code does.
Initializing the stack
The first instruction initializes the stack
1 2 3 4 5 | .equ SP, 2 # actually required for the stack
.equ VAL, 5 # just for convenience
.text
ldzwq 0, %SP
|
Note that actually both .equ directives are just for convenience (and a more expressive code). In the ulm-qui you only see that register 2 is now zero initialized, but the meaning of that is the following:
Push character 'A'
As mentioned above we first have to load the character constant into a register. Except for register %SP (and %0 or course) any register can be used. Such a register is hidden behind the symbol VAL (in the example code below defined as 5):
1 2 | # load value in %VAL
ldzwq 'A', %VAL
|
Now the two instructions for the push operation
1 2 3 | # push %VAL
subq 8, %SP, %SP
movq %VAL, (%SP)
|
will modify the stack pointer and the quad word at the end of the pointer:
Push character 'B'
We follows strictly the same pattern
1 2 3 4 5 | # load value in %VAL
ldzwq 'B', %VAL
# push %VAL
subq 8, %SP, %SP
movq %VAL, (%SP)
|
which results in
Push character 'C'
And analogously
1 2 3 4 5 | # load value in %VAL
ldzwq 'C', %VAL
# push %VAL
subq 8, %SP, %SP
movq %VAL, (%SP)
|
results in
Pop a character and print it
We now apply the pattern for the pop operation and afterwards print the popped value, i.e.
1 2 3 4 5 | # pop %VAL
movq (%SP), %VAL
addq 8, %SP, %SP
# use popped value
putc %VAL
|
Looking at the memory you will see no difference. Only the register for the stack pointer changed. And this has the following meaning:
Pop a character and print it
Again the same pattern is applied to remove and print the next character from the stack:
1 2 3 4 5 | # pop %VAL
movq (%SP), %VAL
addq 8, %SP, %SP
# use popped value
putc %VAL
|
Again, the memory content is not affected but the stack pointer has changed
Pop a character and print it
I guess you see by now the effect of
1 2 3 4 5 | # pop %VAL
movq (%SP), %VAL
addq 8, %SP, %SP
# use popped value
putc %VAL
|
and how to interpret things you observe in the debugger:
Quiz 07 (About the stack)
This quiz consists of two exercise (see below) and you have to hand in 3 files. On theon do that with the command
submit hpc quiz07 understand_the_bug.s understand_the_bug.txt push_64.s
Your answers will not be evaluated automatically but you should get a confirmation email after your submit.
Exercise: understand_the_bug.s and understand_the_bug.txt
This is basically the program form above:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 | .equ SP, 2
.equ VAL, 68 # just for convenience
.text
/*
initialize stack
*/
ldzwq 0, %SP
/*
push 'A", push 'B', push 'C'
*/
# load value in %VAL
ldzwq 'A', %VAL
# push %VAL
subq 8, %SP, %SP
movq %VAL, (%SP)
# load value in %VAL
ldzwq 'B', %VAL
# push %VAL
subq 8, %SP, %SP
movq %VAL, (%SP)
# load value in %VAL
ldzwq 'C', %VAL
# push %VAL
subq 8, %SP, %SP
movq %VAL, (%SP)
/*
pop and print value (3 times)
*/
# pop %VAL
movq (%SP), %VAL
addq 8, %SP, %SP
# use popped value
putc %VAL
# pop %VAL
movq (%SP), %VAL
addq 8, %SP, %SP
# use popped value
putc %VAL
# pop %VAL
movq (%SP), %VAL
addq 8, %SP, %SP
# use popped value
putc %VAL
/*
* TODO: Your bug here.
*/
putc '\n'
halt 0
|
With a small modification The program has a comment TODO: Your bug here. Remove that comment with and replace it with
1 2 3 4 5 | # pop %VAL
movq (%SP), %VAL
addq 8, %SP, %SP
# use popped value
putc %VAL
|
Don't change anything else. Now translate it and run the program. You should observe the following:
theon$ ulm understand_the_bug CBAD theon$
I think so far this exercise was simple ;-) So now comes the challenge:
-
Explain why this result could be predicted.
-
Modify at most to characters in this buggy program so that the program prints CBA! followed by a newline.
To be clear: don't fix the bug, play with the bug. We know that the problem is that more elements are removed than previously added to the stack. You will have the opportunity to fix many bugs in the future.
Exercise: push_64.s
Write a program that does the following (in that order):
-
push the literal 0xFFFFFFFFFFFFFFFF on the stack,
-
push the literal 0x1234567890123456 on the stack.
Hint: The purpose of the that is to “motivate” you to read about how in general a 64-bit address can be loaded into a register. Because in general
1 | ldzwq function_label, %CALL
|
is only working if the label refers to an address in the range from 0 to \(2^{16}-1\).