Using global variables to communicate with subprograms
With subprograms we can already write something useful. As a proof of concept, here is a program that reads in an integer, prints a string and then prints the integer:
theon$ ulm subprog_io_stuff
123
You typed: 123
theon$
C code can be used as pseudo code to describe our assembly programs. This reflects which parts of the C programming language we can currently implement in assembly code ourselves. And vice versa, it gives you an idea of the assembly code the C compiler could generate for you if you use these features of the C programming language.
The program subprog_io_stuff can be described in such a C pseudo code style as follows:
// for passing an argument and receiving a return value
uint64_t arg;

void
getui()
{
    // implementation of subprogram getui
}

void
puts()
{
    // implementation of subprogram puts
}

void
putui()
{
    // implementation of subprogram putui
}

// some string (array of characters)
char msg[] = "You typed: ";

void
main()
{
    // local variable
    uint64_t n;

    // call subprogram getui, copy return value to local variable
    getui();
    n = arg;

    // copy pointer to string literal to arg, call puts
    arg = msg;
    puts();

    // copy local variable to arg, call putui
    arg = n;
    putui();
}
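Because all subprograms share the single global variable arg, a return value only survives until the next write to arg. That is why main copies arg into the local variable n right after getui() returns. The following is a minimal sketch of this pattern. The subprograms twice and succ and the function demo are hypothetical and not part of the program above; they only do arithmetic so that the effect is easy to check.

```c
#include <stdint.h>

/* Shared "mailbox": holds the argument on entry and the result on exit. */
uint64_t arg;

/* Hypothetical subprogram: replaces arg by 2 * arg. */
void twice(void) { arg = 2 * arg; }

/* Hypothetical subprogram: replaces arg by arg + 1. */
void succ(void) { arg = arg + 1; }

/* Computes 2 * x + (x + 1), communicating with the subprograms
   only through the global variable arg. */
uint64_t demo(uint64_t x)
{
    uint64_t n;     /* local variable, like n in main */

    arg = x;        /* pass the argument */
    twice();
    n = arg;        /* save the result before arg is reused */

    arg = x;        /* the next argument overwrites the old result */
    succ();

    return n + arg;
}
```

Without the line `n = arg;` the result of twice would be lost as soon as arg is set up for the call of succ.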
Exercise: It will actually compile and work
The above code can be slightly modified so that it compiles with gcc and the executable works, i.e. it reads in an unsigned 64-bit integer, prints the string “You typed: ” and then prints the integer followed by a newline.
Apply the following modifications:
- The type uint64_t is not a built-in type in C. Usually you have to include <stdint.h> to get the proper type definition for an architecture. Insert the following line at the beginning of the source file:
    typedef unsigned long uint64_t;
  This is the proper type definition for theon.
- Insert implementations for the subprograms. These consist of calls to functions that are defined in the C library, which is linked after the compilation. In the function body of getui insert the line
    scanf("%lu", &arg);
  This function receives two parameters: a pointer to a (format) string and a pointer to the global variable arg. In the function body of puts insert the line
    printf("%s", arg);
  In the function body of putui insert the line
    printf("%lu", arg);
After these modifications the code should compile (bravely ignore a few warnings):
theon$ gcc -o subprog_io_stuff subprog_io_stuff_gcc.c
subprog_io_stuff_gcc.c: In function 'getui':
subprog_io_stuff_gcc.c:11:5: warning: implicit declaration of function 'scanf' [-Wimplicit-function-declaration]
     scanf("%lu", &arg);
     ^~~~~
subprog_io_stuff_gcc.c:11:5: warning: incompatible implicit declaration of built-in function 'scanf'
subprog_io_stuff_gcc.c:11:5: note: include '<stdio.h>' or provide a declaration of 'scanf'
subprog_io_stuff_gcc.c: At top level:
subprog_io_stuff_gcc.c:15:1: warning: conflicting types for built-in function 'puts' [-Wbuiltin-declaration-mismatch]
 puts()
 ^~~~
subprog_io_stuff_gcc.c: In function 'puts':
subprog_io_stuff_gcc.c:18:5: warning: implicit declaration of function 'printf' [-Wimplicit-function-declaration]
     printf("%s", arg);
     ^~~~~~
subprog_io_stuff_gcc.c:18:5: warning: incompatible implicit declaration of built-in function 'printf'
subprog_io_stuff_gcc.c:18:5: note: include '<stdio.h>' or provide a declaration of 'printf'
subprog_io_stuff_gcc.c: In function 'putui':
subprog_io_stuff_gcc.c:25:5: warning: incompatible implicit declaration of built-in function 'printf'
     printf("%lu", arg);
     ^~~~~~
subprog_io_stuff_gcc.c:25:5: note: include '<stdio.h>' or provide a declaration of 'printf'
subprog_io_stuff_gcc.c: In function 'main':
subprog_io_stuff_gcc.c:41:9: warning: assignment makes integer from pointer without a cast [-Wint-conversion]
     arg = msg;
         ^
theon$
Afterwards run the executable subprog_io_stuff.
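The typedef and the %lu conversions used above are tied to theon, where unsigned long has 64 bits. A possible portable variant, sketched here under the assumption that a hosted C99 library is available, takes uint64_t from <stdint.h> and the matching conversion macros SCNu64 and PRIu64 from <inttypes.h>. The helpers getui_from and putui_to are hypothetical names: they work on strings instead of the terminal, so the result can be checked without typing any input.

```c
#include <inttypes.h>   /* SCNu64, PRIu64; also includes <stdint.h> */
#include <stddef.h>     /* size_t */
#include <stdio.h>      /* sscanf, snprintf */

/* for passing an argument and receiving a return value */
uint64_t arg;

/* Hypothetical variant of getui: parse an unsigned integer from text
   into arg. Returns 1 if a number was read (return value of sscanf). */
int getui_from(const char *text)
{
    return sscanf(text, "%" SCNu64, &arg);
}

/* Hypothetical variant of putui: write arg as decimal digits into buf.
   Returns the number of characters needed (return value of snprintf). */
int putui_to(char *buf, size_t size)
{
    return snprintf(buf, size, "%" PRIu64, arg);
}
```

Including <stdio.h> (directly, as done here) also removes the warnings about implicit declarations.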
Exercise: Understanding the assembly implementation for the ULM
The following assembly program is an “equivalent” implementation of the above C program for the ULM. Look at the implementation of main and understand the details that are hidden by the C programming language, e.g. copying a global variable to a local variable, calling a subprogram and returning from a subprogram.
#NOTE: relevant for the function call convention
.equ FP, 1
.equ SP, 2
.equ RET, 3
.text
/*
Entry point
*/
_start:
#NOTE: relevant for the convention
ldzwq 0, %SP
subq 16, %SP, %SP
ldzwq main, %4
jmp %4, %RET
addq 16, %SP, %SP
/*
(only) Exit point
*/
halt 0
//------------------------------------------------------------------------------
// GLOBAL variable for passing arguments and receiving results
//------------------------------------------------------------------------------
.bss
.align 8
arg: .space 8
//------------------------------------------------------------------------------
// PROGRAM main
//------------------------------------------------------------------------------
.data
msg .string "You typed: "
.text
main:
// Function prologue (with 8 bytes for a local variable)
movq %RET, 0(%SP)
movq %FP, 8(%SP)
addq 0, %SP, %FP
subq 8, %SP, %SP # 8 bytes for local variable n
// begin of the function body
subq 16, %SP, %SP // function call: getui()
ldzwq getui, %4
jmp %4, %RET
addq 16, %SP, %SP
ldzwq arg, %4 // copy global arg to local n
movq (%4), %4
movq %4, -8(%FP)
ldzwq msg, %4 // copy address of string into global arg
ldzwq arg, %5
movq %4, (%5)
subq 16, %SP, %SP // function call: puts
ldzwq puts, %4
jmp %4, %RET
addq 16, %SP, %SP
movq -8(%FP), %4 // copy local n to global arg
ldzwq arg, %5
movq %4, (%5)
subq 16, %SP, %SP // function call: putui
ldzwq putui, %4
jmp %4, %RET
addq 16, %SP, %SP
putc '\n' // for convenience
// end of the function body
// Function epilogue
addq 0, %FP, %SP
movq 8(%SP), %FP
movq 0(%SP), %RET
jmp %RET, %0
//------------------------------------------------------------------------------
// SUBPROGRAMS
//------------------------------------------------------------------------------
/*
void
getui()
{
*/
.text
getui:
// save what needs to be restored
movq %RET, 0(%SP)
movq %FP, 8(%SP)
addq 0, %SP, %FP
// begin of the function body
# We use %4 to store the integer that we read. We begin with %4 = 0
ldzwq 0, %4
# Now attach digits until the user types '\n'
.L2 getc %5
subq '\n', %5, %0
jz .L3
subq '0', %5, %5
imulq 10, %4, %4
addq %5, %4, %4
jmp .L2
.L3
# Copy %4 into global variable arg
ldzwq arg, %5
movq %4, (%5)
// end of the function body
// restore what was saved and return
addq 0, %FP, %SP
movq 8(%SP), %FP
movq 0(%SP), %RET
jmp %RET, %0
/*
}
*/
//------------------------------------------------------------------------------
/*
void
puts(void)
{
*/
.text
puts:
// save what needs to be restored
movq %RET, 0(%SP)
movq %FP, 8(%SP)
addq 0, %SP, %FP
// begin of the function body
ldzwq arg, %4
movq (%4), %4
.L0 movzbq (%4), %5
subq 0, %5, %0
jz .L1
putc %5
addq 1, %4, %4
jmp .L0
.L1
// end of the function body
// restore what was saved and return
addq 0, %FP, %SP
movq 8(%SP), %FP
movq 0(%SP), %RET
jmp %RET, %0
/*
}
*/
//------------------------------------------------------------------------------
.bss
putui_buf:
.space 21
/*
void
putui(void)
{
*/
.text
putui:
// save what needs to be restored
movq %RET, 0(%SP)
movq %FP, 8(%SP)
addq 0, %SP, %FP
// begin of the function body
// load global arg
ldzwq arg, %4
movq (%4), %4
// We will use the register pair (%4, %5) for divq. So zero out %5
ldzwq 0, %5
ldzwq putui_buf, %7
.L5 divq 10, %4, %4
addq '0', %6, %6
addq 1, %7, %7
movb %6, (%7)
subq 0, %4, %0
jnz .L5
.L6 movzbq (%7), %6
subq 0, %6, %0
jz .L7
putc %6
subq 1, %7, %7
jmp .L6
.L7
// end of the function body
// restore what was saved and return
addq 0, %FP, %SP
movq 8(%SP), %FP
movq 0(%SP), %RET
jmp %RET, %0
/*
}
*/
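The two loops in getui and putui may be easier to follow in C. The following sketch mirrors them instruction by instruction. The names parse_ui and format_ui are hypothetical, and as an assumption made for testability the functions read from and write to strings instead of using getc and putc.

```c
#include <stddef.h>
#include <stdint.h>

/* Mirrors the loop .L2/.L3 in getui: start with 0 and attach decimal
   digits until '\n'. The string s must contain a '\n'. Like the assembly
   code, this does not check that the characters are digits. */
uint64_t parse_ui(const char *s)
{
    uint64_t val = 0;                           /* ldzwq 0, %4         */

    while (*s != '\n') {                        /* subq '\n', %5, %0   */
        uint64_t digit = (uint64_t)(*s - '0');  /* subq '0', %5, %5    */
        val = 10 * val + digit;                 /* imulq, then addq    */
        ++s;                                    /* next getc           */
    }
    return val;
}

/* Mirrors the loops .L5 and .L6 in putui: the digits are produced from
   least to most significant, stored behind a zero byte and then emitted
   backwards until that zero byte is reached. The buffer out needs room
   for 20 digits plus a terminating zero. Returns the number of digits. */
size_t format_ui(uint64_t val, char *out)
{
    char buf[21] = {0};     /* like putui_buf in .bss: all bytes zero  */
    char *p = buf;          /* ldzwq putui_buf, %7                     */
    size_t len = 0;

    do {
        char digit = (char)('0' + val % 10);    /* remainder plus '0'  */
        val /= 10;                              /* quotient of divq    */
        ++p;                                    /* addq 1, %7, %7      */
        *p = digit;                             /* movb %6, (%7)       */
    } while (val != 0);                         /* jnz .L5             */

    while (*p != 0) {       /* buf[0] is never written and stops the loop */
        out[len++] = *p;                        /* putc %6             */
        --p;                                    /* subq 1, %7, %7      */
    }
    out[len] = '\0';
    return len;
}
```

The buffer size of 21 bytes matches putui_buf: the largest 64-bit unsigned integer has 20 decimal digits, and one extra byte at the front stays zero as the stop marker.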
Copying the global variable to the local variable
It helps to draw a picture of the memory layout as a guide. The global variable has the label arg, which is the address of its memory location. The address of the local variable is %FP - 8. So with
    ldzwq arg, %4
    movq (%4), %4
you first load the address of the global variable and then the global variable itself into register %4. From there you can copy the value of the global variable to the memory location of the local variable with
    movq %4, -8(%FP)
Copying the local variable to the global variable
Here you need two registers, one for fetching the local variable and one for holding the address of the global variable:
    movq -8(%FP), %4
    ldzwq arg, %5
    movq %4, (%5)
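Both copy directions can be replayed in a toy model written in C. This is only a sketch for illustration and not the ULM itself: memory is a byte array, an address is an index into it, the address ARG chosen for the global variable is hypothetical, and the 8 bytes are stored in the byte order of the host machine.

```c
#include <stdint.h>
#include <string.h>

/* Toy model (assumption): 64 bytes of memory and 8 registers. */
enum { MEM_SIZE = 64 };
enum { ARG = 8 };           /* hypothetical address of the label arg   */
enum { FP = 1 };            /* reg[FP] plays the role of %FP           */

uint8_t  mem[MEM_SIZE];
uint64_t reg[8];

/* Fetch 8 bytes starting at mem[addr]. */
uint64_t load_q(uint64_t addr)
{
    uint64_t val;
    memcpy(&val, &mem[addr], sizeof val);
    return val;
}

/* Store 8 bytes starting at mem[addr]. */
void store_q(uint64_t addr, uint64_t val)
{
    memcpy(&mem[addr], &val, sizeof val);
}

/* n = arg; */
void global_to_local(void)
{
    reg[4] = ARG;                       /* ldzwq arg, %4     */
    reg[4] = load_q(reg[4]);            /* movq (%4), %4     */
    store_q(reg[FP] - 8, reg[4]);       /* movq %4, -8(%FP)  */
}

/* arg = n; */
void local_to_global(void)
{
    reg[4] = load_q(reg[FP] - 8);       /* movq -8(%FP), %4  */
    reg[5] = ARG;                       /* ldzwq arg, %5     */
    store_q(reg[5], reg[4]);            /* movq %4, (%5)     */
}
```

The model makes the asymmetry visible: a load can reuse the register that holds the address, while a store needs the value and the address at the same time and therefore two registers.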
Copying the pointer to a string to a global variable
For using the puts subprogram you first have to copy the address of a string, i.e. a pointer to the string, into the global variable.
With
    ldzwq msg, %4
you copy the address of the first character of the string into register %4, i.e. now %4 points to the string. Then you copy this address to the memory location of the global variable:
    ldzwq arg, %5
    movq %4, (%5)
The memory location of the global variable can now be interpreted as a pointer to the string.
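In C, storing a pointer in an integer variable and interpreting it as a pointer again has to be spelled out with casts. The following sketch shows one way to do it and mirrors the loop of the puts subprogram. It assumes that pointers fit into 64 bits. The names set_arg_to_string and puts_to are hypothetical, and puts_to copies the characters into a buffer instead of printing them so that the result can be checked.

```c
#include <stddef.h>
#include <stdint.h>

/* for passing an argument and receiving a return value */
uint64_t arg;

/* some string (array of characters) */
char msg[] = "You typed: ";

/* Store the address of the first character of s in arg.
   The cast makes explicit what `arg = msg;` does with a warning. */
void set_arg_to_string(const char *s)
{
    arg = (uint64_t)(uintptr_t)s;
}

/* Hypothetical variant of puts: follow the pointer stored in arg and
   copy the characters to out until the terminating zero byte.
   Returns the number of characters copied. */
size_t puts_to(char *out)
{
    /* ldzwq arg, %4 ; movq (%4), %4 : interpret the integer as a pointer */
    const unsigned char *p = (const unsigned char *)(uintptr_t)arg;
    size_t len = 0;

    while (*p != 0) {               /* movzbq (%4), %5 ; jz .L1 */
        out[len++] = (char)*p;      /* putc %5                  */
        ++p;                        /* addq 1, %4, %4           */
    }
    out[len] = '\0';
    return len;
}
```

On the assembly level no cast is needed because a register or memory location just holds 64 bits; whether they are an integer or an address is decided by the instruction that uses them.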