=================
Alignment of data                                                          [TOC]
=================

Fetch and store operations on the ULM require that the address of the data is a
multiple of the data size. That means a quad word (8 bytes) can only be fetched
from or stored to an address that is a multiple of 8. Analogously, fetching or
storing a long word (4 bytes) requires that the address is a multiple of 4, and
for a word (2 bytes) a multiple of 2. Only a single byte can be fetched from or
stored to an arbitrary address.

---- VIDEO ------------------------------
https://www.youtube.com/embed/fuoisq2J9oA
-----------------------------------------

Example for a bus error (bad alignment)
=======================================

Just try to run the following mini program, consisting of instructions that
store data at address $0x24$ (or $36$ in decimal):

---- CODE (file=session10/memop/bus_error.s) -----------------------------------
    ldzwq   0x24,   %1
    movb    %0,     (%1)
    movw    %0,     (%1)
    movl    %0,     (%1)
    movq    %0,     (%1)
--------------------------------------------------------------------------------

You don't need a halt instruction here because the program is sure to crash
when the last instruction (the `movq`) gets executed:

---- SHELL (path=session10/memop) ----------------------------------------------
ulmas -o bus_error bus_error.s
ulm bus_error
--------------------------------------------------------------------------------

Storing a byte always works for any address, so it is clear that the `movb`
will not crash.
Storing a word requires an even address, which is the case here:

---- TIKZ ----------------------------------------------------------------------
\begin{tikzpicture}
    \input{memory.tex}

    \renewcommand\MemCellWidth {1.3}
    \DrawMemArrayOpen{0}{16}
    \DrawHack{ 0}{$2^2$}
    \DrawWordVariable[red!40]{4}{aligned word}
    \DrawMemAddress{0}{0x20}
    \DrawMemAddress{4}{0x24}
    \DrawMemAddress{8}{0x28}
    \DrawMemAddress{12}{0x2C}
    \DrawMemAddress{16}{0x30}
\end{tikzpicture}
--------------------------------------------------------------------------------

Storing a long word is no problem because the address is also aligned to 4
bytes:

---- TIKZ ----------------------------------------------------------------------
\begin{tikzpicture}
    \input{memory.tex}

    \renewcommand\MemCellWidth {1.3}
    \DrawMemArrayOpen{0}{16}
    \DrawHack{ 0}{$2^2$}
    \DrawLongVariable[red!40]{4}{aligned long word}
    \DrawMemAddress{0}{0x20}
    \DrawMemAddress{4}{0x24}
    \DrawMemAddress{8}{0x28}
    \DrawMemAddress{12}{0x2C}
    \DrawMemAddress{16}{0x30}
\end{tikzpicture}
--------------------------------------------------------------------------------

So what happened is that the last instruction (the `movq`) tried to store a
quad word at address 36 (or 0x24 in hex), which is not a multiple of 8:

---- TIKZ ----------------------------------------------------------------------
\begin{tikzpicture}
    \input{memory.tex}

    \renewcommand\MemCellWidth {1.3}
    \DrawMemArrayOpen{0}{16}
    \DrawHack{ 0}{$2^2$}
    \DrawQuadVariable[red!40]{4}{unaligned quad word}
    \DrawMemAddress{0}{0x20}
    \DrawMemAddress{4}{0x24}
    \DrawMemAddress{8}{0x28}
    \DrawMemAddress{12}{0x2C}
    \DrawMemAddress{16}{0x30}
\end{tikzpicture}
--------------------------------------------------------------------------------

Why is there such a handicap?
-----------------------------

It is easier to build the hardware if you have this restriction. And if you
want or need to, you can still fetch or store data of any size at any address;
an unaligned access just needs more than one instruction.

Why did the `movq` instructions work in the stack example?
----------------------------------------------------------

The stack pointer was initialized with 0 (which is a multiple of 8) and only
decremented or incremented by 8. So the address $u\left(\%SP\right)$ was always
aligned to 8 bytes.

Using aligned data
==================

The directives `.word`, `.long` and `.quad` generate, where needed, additional
bytes for padding so that the actual data will be aligned when a program is
loaded. The following example

---- CODE (file=session10/align/data.s) ----------------------------------------
    .data
    .byte   1
    .quad   2
    .byte   3
    .long   4
    .byte   5
    .word   6
--------------------------------------------------------------------------------

generates the following memory layout:

---- TIKZ ----------------------------------------------------------------------
\begin{tikzpicture}
    \input{memory.tex}

    \renewcommand\MemCellWidth {0.7}
    \DrawHack{32}{$2^2$}
    \DrawMemArrayOpenRight{0}{31}

    \DrawMemAddress{0}{0x00}
    \DrawMemAddress{4}{0x04}
    \DrawMemAddress{8}{0x08}
    \DrawMemAddress{12}{0x0C}
    \DrawMemAddress{16}{0x10}
    \DrawMemAddress{20}{0x14}
    \DrawMemAddress{24}{0x18}
    \DrawMemAddress{28}{0x1C}
    \DrawMemAddress{32}{0x20}

    \DrawByteVariable[orange!40]{0}{}
    \DrawQuadVariable[orange!40]{8}{}
    \DrawByteVariable[orange!40]{16}{}
    \DrawLongVariable[orange!40]{20}{}
    \DrawByteVariable[orange!40]{24}{}
    \DrawWordVariable[orange!40]{26}{}

    \DrawMemCellContent{0}{01}

    \DrawMemVariable[gray!20]{1}{8}{}
    \DrawAnnotateMemCellAbove{1}{padding bytes for .quad}
    \DrawMemCellContent{1}{00}
    \DrawMemCellContent{2}{00}
    \DrawMemCellContent{3}{00}
    \DrawMemCellContent{4}{00}
    \DrawMemCellContent{5}{00}
    \DrawMemCellContent{6}{00}
    \DrawMemCellContent{7}{00}

    \DrawMemCellContent{8}{00}
    \DrawMemCellContent{9}{00}
    \DrawMemCellContent{10}{00}
    \DrawMemCellContent{11}{00}
    \DrawMemCellContent{12}{00}
    \DrawMemCellContent{13}{00}
    \DrawMemCellContent{14}{00}
    \DrawMemCellContent{15}{02}

    \DrawMemCellContent{16}{03}

    \DrawMemVariable[gray!20]{17}{20}{}
    \DrawAnnotateMemCellAbove{17}{padding bytes for .long}
    \DrawMemCellContent{17}{00}
    \DrawMemCellContent{18}{00}
    \DrawMemCellContent{19}{00}

    \DrawMemCellContent{20}{00}
    \DrawMemCellContent{21}{00}
    \DrawMemCellContent{22}{00}
    \DrawMemCellContent{23}{04}

    \DrawMemCellContent{24}{05}

    \DrawMemVariable[gray!20]{25}{26}{}
    \DrawAnnotateMemCellAbove{25}{padding byte for .word}
    \DrawMemCellContent{25}{00}

    \DrawMemCellContent{26}{00}
    \DrawMemCellContent{27}{06}
\end{tikzpicture}
--------------------------------------------------------------------------------

Translate the above code with

---- SHELL (path=session10/align) ----------------------------------------------
ulmas -o data data.s
--------------------------------------------------------------------------------

and look at the assembler output:

:import: session10/align/data

Pitfalls when using labels for aligned data
-------------------------------------------

Although the data generated by the directives is aligned, using labels like
this will not have the effect you might expect:

---- CODE (file=session10/align/data_label_fail.s) -----------------------------
    .data
a   .byte   1
b   .quad   2
c   .byte   3
d   .long   4
e   .byte   5
f   .word   6
--------------------------------------------------------------------------------

Recall that labels refer to the address of the first byte generated for the
next instruction or pseudo operation. If a directive requires padding, then
this is the address of a padding byte.
Hence, the labels refer to addresses as illustrated here:

---- TIKZ ----------------------------------------------------------------------
\begin{tikzpicture}
    \input{memory.tex}

    \DrawHack{32}{$2^2$}
    \renewcommand\MemCellWidth { 0.7}
    \DrawMemArrayOpenRight{0}{32}

    \DrawMemAddress{0}{0x00}
    \DrawMemAddress{4}{0x04}
    \DrawMemAddress{8}{0x08}
    \DrawMemAddress{12}{0x0C}
    \DrawMemAddress{16}{0x10}
    \DrawMemAddress{20}{0x14}
    \DrawMemAddress{24}{0x18}
    \DrawMemAddress{28}{0x1C}
    \DrawMemAddress{32}{0x20}

    \DrawByteVariable[orange!40]{0}{}
    \DrawQuadVariable[orange!40]{8}{}
    \DrawByteVariable[orange!40]{16}{}
    \DrawLongVariable[orange!40]{20}{}
    \DrawByteVariable[orange!40]{24}{}
    \DrawWordVariable[orange!40]{26}{}

    \DrawMemCellContent{0}{01}

    \DrawMemVariable[gray!20]{1}{8}{}
    \DrawMemCellContent{1}{00}
    \DrawMemCellContent{2}{00}
    \DrawMemCellContent{3}{00}
    \DrawMemCellContent{4}{00}
    \DrawMemCellContent{5}{00}
    \DrawMemCellContent{6}{00}
    \DrawMemCellContent{7}{00}

    \DrawMemCellContent{8}{00}
    \DrawMemCellContent{9}{00}
    \DrawMemCellContent{10}{00}
    \DrawMemCellContent{11}{00}
    \DrawMemCellContent{12}{00}
    \DrawMemCellContent{13}{00}
    \DrawMemCellContent{14}{00}
    \DrawMemCellContent{15}{02}

    \DrawMemCellContent{16}{03}

    \DrawMemVariable[gray!20]{17}{20}{}
    \DrawMemCellContent{17}{00}
    \DrawMemCellContent{18}{00}
    \DrawMemCellContent{19}{00}

    \DrawMemCellContent{20}{00}
    \DrawMemCellContent{21}{00}
    \DrawMemCellContent{22}{00}
    \DrawMemCellContent{23}{04}

    \DrawMemCellContent{24}{05}

    \DrawMemVariable[gray!20]{25}{26}{}
    \DrawMemCellContent{25}{00}

    \DrawMemCellContent{26}{00}
    \DrawMemCellContent{27}{06}

    \DrawMemLabel{0}{a}
    \DrawMemLabel{1}{b}
    \DrawMemLabel{16}{c}
    \DrawMemLabel{17}{d}
    \DrawMemLabel{24}{e}
    \DrawMemLabel{25}{f}
\end{tikzpicture}
--------------------------------------------------------------------------------

Translate the above code with

---- SHELL (path=session10/align) ----------------------------------------------
ulmas -o data_label_fail data_label_fail.s
--------------------------------------------------------------------------------

and look at the assembler output:

:import: session10/align/data_label_fail

Look at the entries of the symbol table to see what addresses the assembler
associates with the labels. Confirm that this matches what you saw in the above
sketch of the memory layout.

Using the `.align` directive
----------------------------

The `.align` directive instructs the assembler to generate padding bytes until
the next address has a specified alignment. This can be used such that data
directives don't have to implicitly generate such padding bytes:

---- CODE (file=session10/align/data_align_label.s) ----------------------------
    .data
a   .byte   1
    .align  8   # generate padding bytes until next address is aligned to 8
b   .quad   2
c   .byte   3
    .align  4   # generate padding bytes until next address is aligned to 4
d   .long   4
e   .byte   5
    .align  2   # generate padding bytes until next address is aligned to 2
f   .word   6
--------------------------------------------------------------------------------

Then the labels have the desired effect: they refer to the address of the
actual data and not to the address of a padding byte:

---- TIKZ ----------------------------------------------------------------------
\begin{tikzpicture}
    \input{memory.tex}

    \renewcommand\MemCellWidth { 0.7}
    \DrawHack{32}{$2^2$}
    \DrawMemArrayOpenRight{0}{32}

    \DrawMemAddress{0}{0x00}
    \DrawMemAddress{4}{0x04}
    \DrawMemAddress{8}{0x08}
    \DrawMemAddress{12}{0x0C}
    \DrawMemAddress{16}{0x10}
    \DrawMemAddress{20}{0x14}
    \DrawMemAddress{24}{0x18}
    \DrawMemAddress{28}{0x1C}
    \DrawMemAddress{32}{0x20}

    \DrawByteVariable[orange!40]{0}{}
    \DrawQuadVariable[orange!40]{8}{}
    \DrawByteVariable[orange!40]{16}{}
    \DrawLongVariable[orange!40]{20}{}
    \DrawByteVariable[orange!40]{24}{}
    \DrawWordVariable[orange!40]{26}{}

    \DrawMemCellContent{0}{01}

    \DrawMemVariable[gray!20]{1}{8}{}
    \DrawMemCellContent{1}{00}
    \DrawMemCellContent{2}{00}
    \DrawMemCellContent{3}{00}
    \DrawMemCellContent{4}{00}
    \DrawMemCellContent{5}{00}
    \DrawMemCellContent{6}{00}
    \DrawMemCellContent{7}{00}

    \DrawMemCellContent{8}{00}
    \DrawMemCellContent{9}{00}
    \DrawMemCellContent{10}{00}
    \DrawMemCellContent{11}{00}
    \DrawMemCellContent{12}{00}
    \DrawMemCellContent{13}{00}
    \DrawMemCellContent{14}{00}
    \DrawMemCellContent{15}{02}

    \DrawMemCellContent{16}{03}

    \DrawMemVariable[gray!20]{17}{20}{}
    \DrawMemCellContent{17}{00}
    \DrawMemCellContent{18}{00}
    \DrawMemCellContent{19}{00}

    \DrawMemCellContent{20}{00}
    \DrawMemCellContent{21}{00}
    \DrawMemCellContent{22}{00}
    \DrawMemCellContent{23}{04}

    \DrawMemCellContent{24}{05}

    \DrawMemVariable[gray!20]{25}{26}{}
    \DrawMemCellContent{25}{00}

    \DrawMemCellContent{26}{00}
    \DrawMemCellContent{27}{06}

    \DrawMemLabel{0}{a}
    \DrawMemLabel{8}{b}
    \DrawMemLabel{16}{c}
    \DrawMemLabel{20}{d}
    \DrawMemLabel{24}{e}
    \DrawMemLabel{26}{f}
\end{tikzpicture}
--------------------------------------------------------------------------------

Translate the above code with

---- SHELL (path=session10/align) ----------------------------------------------
ulmas -o data_align_label data_align_label.s
--------------------------------------------------------------------------------

and look at the assembler output:

:import: session10/align/data_align_label

Again, compare the entries in the symbol table with what was shown above.

Quiz09: Using an exercise about alignment to talk about endianness
==================================================================

When a chunk of data consists of more than one byte, the __endianness__ of an
architecture specifies in which order the bytes are stored in memory. The ULM
uses the so-called _big endian_ format. That means the _most significant byte_
gets stored at the address of the memory location, and each less significant
byte gets stored at the next higher address.
Hence, storing the quad word `0x123456789ABCDEF0` at some address $A$ would
have the following memory layout:

---- TIKZ ----------------------------------------------------------------------
\begin{tikzpicture}
    \input{memory.tex}

    \renewcommand\MemCellWidth {1.6}
    \DrawHack{1}{$2^2$}
    \DrawMemArrayOpen{-10}{-1}
    \DrawMemAddressAbove{-9}{A}

    \DrawMemVariable[green!50]{-9}{-1}{}
    \DrawAnnotateMemCellAbove{-5.5}{quad word in big endian format}

    \DrawMemCellContent{-9}{12}
    \DrawMemCellContent{-8}{34}
    \DrawMemCellContent{-7}{56}
    \DrawMemCellContent{-6}{78}
    \DrawMemCellContent{-5}{9A}
    \DrawMemCellContent{-4}{BC}
    \DrawMemCellContent{-3}{DE}
    \DrawMemCellContent{-2}{F0}
\end{tikzpicture}
--------------------------------------------------------------------------------

The main reason for choosing the big endian format for the ULM is that you see
the bytes in memory as you would write them down. Other architectures use the
_little endian_ format where the byte order is reversed. So the same quad word
would be stored as follows:

---- TIKZ ----------------------------------------------------------------------
\begin{tikzpicture}
    \input{memory.tex}

    \renewcommand\MemCellWidth {1.6}
    \DrawHack{1}{$2^2$}
    \DrawMemArrayOpen{-10}{-1}
    \DrawMemAddressAbove{-9}{A}

    \DrawMemVariable[green!50]{-9}{-1}{}
    \DrawAnnotateMemCellAbove{-5.5}{quad word in little endian format}

    \DrawMemCellContent{-2}{12}
    \DrawMemCellContent{-3}{34}
    \DrawMemCellContent{-4}{56}
    \DrawMemCellContent{-5}{78}
    \DrawMemCellContent{-6}{9A}
    \DrawMemCellContent{-7}{BC}
    \DrawMemCellContent{-8}{DE}
    \DrawMemCellContent{-9}{F0}
\end{tikzpicture}
--------------------------------------------------------------------------------

When you use instructions to fetch and store the complete chunk of data you
actually don't have to care about the endianness. And unless you look at the
memory layout, e.g. with the ulm-qui, you wouldn't even notice what endianness
is used.
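
The difference between working on a complete value and looking at its memory
layout can be tried out on any computer with a few lines of C. This is only a
sketch for illustration and not ULM code; the function names
`most_significant_byte`, `first_byte_in_memory` and `host_is_big_endian` are
made up for this example.

```c
#include <stdint.h>
#include <string.h>

/* Works on the complete value: the result does not depend on endianness. */
uint8_t
most_significant_byte(uint64_t value)
{
    return (uint8_t)(value >> 56);
}

/* Looks at the memory layout: returns the byte at the lowest address. */
uint8_t
first_byte_in_memory(uint64_t value)
{
    uint8_t bytes[sizeof(value)];

    memcpy(bytes, &value, sizeof(value));   /* copy the raw bytes */
    return bytes[0];
}

/* Big endian means: the most significant byte comes first in memory. */
int
host_is_big_endian(void)
{
    uint64_t value = UINT64_C(0x123456789ABCDEF0);

    return first_byte_in_memory(value) == most_significant_byte(value);
}
```

For the quad word from above, `most_significant_byte` gives `0x12` on every
machine, whereas `first_byte_in_memory` gives `0x12` on a big endian machine
and `0xF0` on a little endian one.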
Typical practical cases where endianness matters are exchanging data between
computers (e.g. sending/receiving data over the internet or exchanging binary
files). You also have to deal with endianness if you have to cherry-pick single
bytes from a word, long word or quad word that is stored in memory. And
accessing single bytes is an operation that you need when data is not aligned.

What you have to do: Store a quad word byte by byte at an unaligned address
---------------------------------------------------------------------------

Write a (small) program that stores the quad word `0x123456789ABCDEF0` at
address $u\left(2^{64}-9\right)$ which is in hex representation
`0xFFFFFFFFFFFFFFF7`. The quad word should be stored in the big endian format,
i.e. after the store operation you should have the following memory layout:

---- TIKZ ----------------------------------------------------------------------
\begin{tikzpicture}
    \input{memory.tex}

    \renewcommand\MemCellWidth {1.6}
    \DrawHack{1}{$2^2$}
    \DrawMemArrayOpenLeft{-10}{-1}
    \DrawMemAddressAbove{0}{$2^{64}$}
    \DrawMemAddressAbove{-8}{$2^{64}-8$}

    \DrawMemVariable[green!50]{-9}{-1}{}
    \DrawAnnotateMemCellAbove{-5.5}{quad word in big endian format}

    \DrawMemCellContent{-9}{12}
    \DrawMemCellContent{-8}{34}
    \DrawMemCellContent{-7}{56}
    \DrawMemCellContent{-6}{78}
    \DrawMemCellContent{-5}{9A}
    \DrawMemCellContent{-4}{BC}
    \DrawMemCellContent{-3}{DE}
    \DrawMemCellContent{-2}{F0}

    \DrawPointer{-9}{\%2}
\end{tikzpicture}
--------------------------------------------------------------------------------

On `theon` submit your program with the command
`submit hpc quiz09 bigendian.s`. Your answers will not be evaluated
automatically but you should get a confirmation email after your submission.
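
Why the quad word in this quiz has to be stored byte by byte follows from the
alignment rule at the top of this page. The rule can be written down as a small
C sketch; `is_aligned` is a hypothetical helper for this illustration and not
part of any ULM tool.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Returns 1 if `address` is a multiple of `size`, i.e. if data of `size`
 * bytes could be fetched from or stored to `address` with one instruction.
 * `size` is expected to be 1, 2, 4 or 8 (byte, word, long word, quad word).
 */
int
is_aligned(uint64_t address, uint64_t size)
{
    assert(size == 1 || size == 2 || size == 4 || size == 8);
    return address % size == 0;
}
```

For the address `0x24` from the bus error example this gives 1 for the sizes
1, 2 and 4 but 0 for size 8. The address `0xFFFFFFFFFFFFFFF7` used in this quiz
is odd, so it is only aligned for size 1. That is why only `movb` can be used
to store to it.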
Example: Storing a quad word byte by byte in the little endian format
---------------------------------------------------------------------

---- CODE (type=s) -------------------------------------------------------------
    # store some 64-bit literal in register %1
    ldzwq   @w3(0x123456789ABCDEF0),    %1
    shldwq  @w2(0x123456789ABCDEF0),    %1
    shldwq  @w1(0x123456789ABCDEF0),    %1
    shldwq  @w0(0x123456789ABCDEF0),    %1

    # load some unaligned address in register %2
    ldswq   -9,     %2

    # store %1 byte by byte in the quad word at address %2
    movb    %1,     (%2)
    shrq    8,      %1,     %1
    movb    %1,     1(%2)
    shrq    8,      %1,     %1
    movb    %1,     2(%2)
    shrq    8,      %1,     %1
    movb    %1,     3(%2)
    shrq    8,      %1,     %1
    movb    %1,     4(%2)
    shrq    8,      %1,     %1
    movb    %1,     5(%2)
    shrq    8,      %1,     %1
    movb    %1,     6(%2)
    shrq    8,      %1,     %1
    movb    %1,     7(%2)

    halt    0
--------------------------------------------------------------------------------

---- TIKZ ----------------------------------------------------------------------
\begin{tikzpicture}
    \input{memory.tex}

    \renewcommand\MemCellWidth { 1.6}
    \DrawMemArrayOpenLeft{-10}{-1}
    \DrawHack{1}{$2^2$}
    \DrawMemAddressAbove{0}{$2^{64}$}
    \DrawMemAddressAbove{-8}{$2^{64}-8$}

    \DrawMemCellContent{-9}{$\left(\%2\right)$}
    \DrawMemCellContent{-8}{$1\left(\%2\right)$}
    \DrawMemCellContent{-7}{$2\left(\%2\right)$}
    \DrawMemCellContent{-6}{$3\left(\%2\right)$}
    \DrawMemCellContent{-5}{$4\left(\%2\right)$}
    \DrawMemCellContent{-4}{$5\left(\%2\right)$}
    \DrawMemCellContent{-3}{$6\left(\%2\right)$}
    \DrawMemCellContent{-2}{$7\left(\%2\right)$}

    \DrawPointer{-9}{\%2}
\end{tikzpicture}
--------------------------------------------------------------------------------

After storing the complete quad word the memory layout is

---- TIKZ ----------------------------------------------------------------------
\begin{tikzpicture}
    \input{memory.tex}

    \renewcommand\MemCellWidth {1.6}
    \DrawHack{1}{$2^2$}
    \DrawMemArrayOpenLeft{-10}{-1}
    \DrawMemAddressAbove{0}{$2^{64}$}
    \DrawMemAddressAbove{-8}{$2^{64}-8$}

    \DrawMemVariable[green!50]{-9}{-1}{}
    \DrawAnnotateMemCellAbove{-5.5}{quad word in little endian format}

    \DrawMemCellContent{-2}{12}
    \DrawMemCellContent{-3}{34}
    \DrawMemCellContent{-4}{56}
    \DrawMemCellContent{-5}{78}
    \DrawMemCellContent{-6}{9A}
    \DrawMemCellContent{-7}{BC}
    \DrawMemCellContent{-8}{DE}
    \DrawMemCellContent{-9}{F0}

    \DrawPointer{-9}{\%2}
\end{tikzpicture}
--------------------------------------------------------------------------------

Assignment: Pointer chasing
===========================

We have to use pointers in assembly all the time, but you can never have too
much practice in that respect. That is because we don't have types like in C
for doing the bookkeeping. You have to keep track of the size and meaning of
each memory location yourself. In this assignment you will practise that by
writing a program that uses the following data structure:

---- CODE (file=session10/align/circular_data.s) -------------------------------
    .data

    .align  8
a   .quad   c       # pointer to next node
    .long   'A'     # value of the node

    .align  8
b   .quad   d       # pointer to next node
    .long   'B'     # value of the node

    .align  8
c   .quad   b       # pointer to next node
    .long   'C'     # value of the node

    .align  8
d   .quad   a       # pointer to next node
    .long   'D'     # value of the node
--------------------------------------------------------------------------------

This code fragment describes a __circular linked list__ stored in the data
segment:

- When you visualize the data segment you can see that some memory locations
  are supposed to be interpreted as pointers to a node and others as values of
  a node:

  ---- TIKZ --------------------------------------------------------------------
  \begin{tikzpicture}
      \input{memory.tex}

      \renewcommand\MemCellWidth {0.3}
      \DrawMemArrayOpen{0}{64}
      \DrawHack{ 0}{$0$}

      \DrawQuadVariable{0}{c}
      \DrawLongVariable{8}{\footnotesize 'A'}
      \DrawQuadVariable{16}{d}
      \DrawLongVariable{24}{\footnotesize 'B'}
      \DrawQuadVariable{32}{b}
      \DrawLongVariable{40}{\footnotesize 'C'}
      \DrawQuadVariable{48}{a}
      \DrawLongVariable{56}{\footnotesize 'D'}

      \begingroup
      \renewcommand\PointerDisplaceY{0.7}
      \DrawMemPointer[red!50]{4}{32}
      \par\endgroup

      \begingroup
      \renewcommand\PointerDisplaceY{0.7}
      \DrawMemPointerAbove[green!50]{20}{48}
      \par\endgroup

      \begingroup
      \renewcommand\PointerDisplaceY{0.7}
      \DrawMemPointerAbove[blue!50]{36}{16}
      \par\endgroup

      \begingroup
      \renewcommand\PointerDisplaceY{1.3}
      \DrawMemPointer[orange!50]{52}{0}
      \par\endgroup

      \renewcommand\MemLabelHeight{0.7}
      \DrawMemLabel{0}{a}
      \DrawMemLabel{16}{b}
      \DrawMemLabel{32}{c}
      \DrawMemLabel{48}{d}
  \end{tikzpicture}
  ------------------------------------------------------------------------------

- It is helpful to understand how this picture above is technically described
  in the assembler output. So translate this code fragment

  ---- SHELL (path=session10/align) --------------------------------------------
  ulmas -o circular_data circular_data.s
  ------------------------------------------------------------------------------

  and see how the symbol table in combination with the data segment can be used
  to infer the meaning of the memory locations:

  :import: session10/align/circular_data

For the assignment write a program that prints the values of all nodes,
beginning with the node that has label `a`, and then halts. You can implement
the following algorithm:

---- TIKZ ----------------------------------------------------------------------
\begin{adjustbox}{}
    \textcolor{white}{.$2^{64}$}
    \begin{varwidth}{10cm}
        \algrenewcommand\algorithmicrepeat{\textbf{do}}
        \algrenewcommand\algorithmicuntil{\textbf{until}}
        \algrenewcommand\textproc{}
        \begin{algorithmic}
            \State \textbf{p} := \textbf{start} := pointer to node \textbf{a}
            \Repeat
                \State print value of node at \textbf{p}
                \State \textbf{p} := successor of node at \textbf{p}
            \Until{$\textbf{p}=\textbf{start}$}
        \end{algorithmic}
    \end{varwidth}
\end{adjustbox}
--------------------------------------------------------------------------------

Keep your program minimalistic, you don't have to use functions here.
It is also sufficient to print the least significant byte of the value (this
example uses the `.long` directive because in the picture the characters would
not fit in a box that has the size of a single byte). But also make yourself
aware that the `.long` is always aligned to 4 bytes because the preceding
`.quad` is aligned to 8 bytes.

On `theon` submit your program with the command
`submit hpc quiz10 pointer_chasing.s`. Your answers will not be evaluated
automatically but you should get a confirmation email after your submission.

For getting started
~~~~~~~~~~~~~~~~~~~

The following program prints the value of the node at label `a` and the value
of its successor:

---- CODE (type=s) -------------------------------------------------------------
    .data

    .align  8
a   .quad   c       # pointer to next node
    .long   'A'     # value of the node

    .align  8
b   .quad   d       # pointer to next node
    .long   'B'     # value of the node

    .align  8
c   .quad   b       # pointer to next node
    .long   'C'     # value of the node

    .align  8
d   .quad   a       # pointer to next node
    .long   'D'     # value of the node

    .text

    ldzwq   a,      %1

    movzlq  8(%1),  %2      // fetch value of node
    movq    (%1),   %1      // fetch pointer to next node
    putc    %2

    movzlq  8(%1),  %2      // fetch value of node
    movq    (%1),   %1      // fetch pointer to next node
    putc    %2

    putc    '\n'

    halt    0
--------------------------------------------------------------------------------

Using the directives

---- CODE (type=s) -------------------------------------------------------------
    .equ    node_next,  0
    .equ    node_value, 8
--------------------------------------------------------------------------------

you can write the instructions for fetching the value and the pointer to the
next node in a more readable way:

---- CODE (type=s) -------------------------------------------------------------
    movzlq  node_value(%1), %2      // fetch value of node
    movq    node_next(%1),  %1      // fetch pointer to next node
--------------------------------------------------------------------------------

The following code applies this to the previous _getting started_ example:
---- CODE (file=session10/circular/ex.s,fold) ----------------------------------
    .data

    .align  8
a   .quad   c       # pointer to next node
    .long   'A'     # value of the node

    .align  8
b   .quad   d       # pointer to next node
    .long   'B'     # value of the node

    .align  8
c   .quad   b       # pointer to next node
    .long   'C'     # value of the node

    .align  8
d   .quad   a       # pointer to next node
    .long   'D'     # value of the node

    .equ    node_next,  0
    .equ    node_value, 8

    .text

    ldzwq   a,      %1

    movzlq  node_value(%1), %2
    movq    node_next(%1),  %1
    putc    %2

    movzlq  node_value(%1), %2
    movq    node_next(%1),  %1
    putc    %2

    putc    '\n'

    halt    0
--------------------------------------------------------------------------------

# About structs in C
# ~~~~~~~~~~~~~~~~~~
# Using the directives
#
# ---- CODE (type=s) -------------------------------------------------------------
#     .equ    node_next,  0
#     .equ    node_value, 8
# --------------------------------------------------------------------------------
#
# you can write the instructions for fetching the value and pointer to the next
# node more readable:
#
# ---- CODE (type=s) -------------------------------------------------------------
#     movzlq  node_value(%1), %2      // fetch value of node
#     movq    node_next(%1),  %1      // fetch pointer to next node
# --------------------------------------------------------------------------------
#
# The following code applies this to the previous _getting started_ example:
#
# ---- CODE (file=session10/circular/ex.s,fold) ----------------------------------
#     .data
#
#     .align  8
# a   .quad   c       # pointer to next node
#     .long   'A'     # value of the node
#
#     .align  8
# b   .quad   d       # pointer to next node
#     .long   'B'     # value of the node
#
#     .align  8
# c   .quad   b       # pointer to next node
#     .long   'C'     # value of the node
#
#     .align  8
# d   .quad   a       # pointer to next node
#     .long   'D'     # value of the node
#
#
#     .equ    node_next,  0
#     .equ    node_value, 8
#
#     .text
#
#     ldzwq   a,      %1
#
#     movzlq  node_value(%1), %2
#     movq    node_next(%1),  %1
#     putc    %2
#
#     movzlq  node_value(%1), %2
#     movq    node_next(%1),  %1
#     putc    %2
#
#     putc    '\n'
#
#     halt    0
# --------------------------------------------------------------------------------
#
# In C you can define so called _struct_ types, for example
#
# ---- CODE (file=session10/gcc/struct_def.c) ------------------------------------
# struct Foo {
#     char    a;
#     int     b;
#     char    c;
# };
# --------------------------------------------------------------------------------
#
# Defining a type is not generating any machine code. You can check this by
# looking what assembly code gets generated by the C compiler:
#
# ---- SHELL (path=session10/gcc) ------------------------------------------------
# gcc -o struct_def_from_gcc.s -S struct_def.c
# cat struct_def_from_gcc.s
# ulmcc -o struct_def_from_ulmcc.s struct_def.c
# cat struct_def_from_ulmcc.s
# --------------------------------------------------------------------------------
#
# The sole purpose of types is related to bookkeeping the meaning of memory
# locations. In the above code the type `struct Foo` was defined for variables that
# consist of three fields `a`, `b` and `c`. Each of these fields has also a type
#
# If you have a variable in C it occupies some memory location on the data
# segment, BSS segment, the stack or the so called _heap_. In order to give you an
# idea what bookkeeping means consider this code snippet which defines two global
# variables of type `struct Foo`. That means the memory locations will be in the
# data segment in the same order they were defined:
#
# ---- CODE (file=session10/gcc/struct_vars.c) ------------------------------------
# struct Foo {
#     char    a;
#     int     b;
#     char    c;
# };
#
# struct Foo A = {'a', 42, 'A'};
# struct Foo B = {'b', 13, 'B'};
# --------------------------------------------------------------------------------
#
#
# ---- TIKZ ----------------------------------------------------------------------
# \begin{tikzpicture}
#     \input{memory.tex}
#
#     \renewcommand\MemCellWidth {0.6}
#
#     \DrawMemArrayOpen{0}{32}
#
#     \DrawHack{ 0}{$0$}
#
#     \begingroup
#     \renewcommand\PaddingMemVariable {0.01}
#     \DrawMemVariable{0}{12}{}
#     \renewcommand\PointerDisplaceY {2}
#     \DrawAnnotateMemCellAbove{3}{memory location of struct A}
#     \par\endgroup
#
#     \begingroup
#     \renewcommand\PaddingMemVariable {0.01}
#     \DrawMemVariable{16}{28}{}
#     \renewcommand\PointerDisplaceY {2}
#     \DrawAnnotateMemCellAbove{19}{memory location of struct B}
#     \par\endgroup
#
#
#     \DrawByteVariable{0}{}
#     \DrawLongVariable{4}{}
#     \DrawByteVariable{8}{}
#
#
#     \begingroup
#     \renewcommand\PointerDisplaceY {0.5}
#     \DrawAnnotateMemCell{0}{memory location of field a}
#     \par\endgroup
#
#     \begingroup
#     \renewcommand\PointerDisplaceY {1.2}
#     \DrawAnnotateMemCell{4}{memory location of field b}
#     \par\endgroup
#
#     \begingroup
#     \renewcommand\PointerDisplaceY {1.9}
#     \DrawAnnotateMemCell{8}{memory location of field c}
#     \par\endgroup
#
#
#     \DrawByteVariable{16}{}
#     \DrawLongVariable{20}{}
#     \DrawByteVariable{24}{}
#
#     \begingroup
#     \renewcommand\PointerDisplaceY {0.5}
#     \DrawAnnotateMemCell{16}{memory location of field a}
#     \par\endgroup
#
#     \begingroup
#     \renewcommand\PointerDisplaceY {1.2}
#     \DrawAnnotateMemCell{20}{memory location of field b}
#     \par\endgroup
#
#     \begingroup
#     \renewcommand\PointerDisplaceY {1.9}
#     \DrawAnnotateMemCell{24}{memory location of field c}
#     \par\endgroup
#
#
#     \renewcommand\MemLabelHeight{0.7}
#     \DrawMemLabel{0}{A}
#     \DrawMemLabel{16}{B}
#
# \end{tikzpicture}
# --------------------------------------------------------------------------------

How you could define the above circular list in C
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In C you could define the same data structure in the data segment with the
following code:

---- CODE (file=session10/gcc/circular_data.c) ---------------------------------
/*
    variables of type "struct Node" have two fields:
    - pointer to next node
    - some value
*/
struct Node {
    struct Node *next;
    long        value;
};

/*
    predeclare all nodes so that in the initialization of a node we can
    point to a node that gets defined later:
*/
extern struct Node a;
extern struct Node b;
extern struct Node c;
extern struct Node d;

/*
    define all nodes as global variables:
*/
struct Node a = {
    &c,         // &c is the address of global variable c
    'A'
};

struct Node b = {
    &d,         // &d is the address of global variable d
    'B'
};

struct Node c = {
    &b,         // &b is the address of global variable b
    'C'
};

struct Node d = {
    &a,         // &a is the address of global variable a
    'D'
};
--------------------------------------------------------------------------------

Translate this into assembly code with

---- SHELL (path=session10/gcc) ------------------------------------------------
gcc -S circular_data.c
--------------------------------------------------------------------------------

and have a look at the code the C compiler generated. Most of what you see
should look familiar:

:import: session10/gcc/circular_data.s

:links: endianness -> https://en.wikipedia.org/wiki/Endianness
        circular linked list -> https://en.wikipedia.org/wiki/Linked_list#Circular_linked_list
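
With the C definition of the nodes at hand, the algorithm from the assignment
can also be sketched in C. This is only meant to illustrate the control flow
of the do-until loop, it is not the assembly program asked for in the
assignment. The function `collect_values` and its buffer parameters are made
up for this example: instead of printing, it copies the least significant
byte of each value into a string.

```c
#include <stddef.h>

struct Node {
    struct Node *next;
    long        value;
};

extern struct Node a, b, c, d;

struct Node a = { &c, 'A' };
struct Node b = { &d, 'B' };
struct Node c = { &b, 'C' };
struct Node d = { &a, 'D' };

/*
 * Visits every node of the circular list once, beginning with `start`.
 * The least significant byte of each value is stored in `buf` as long as
 * there is room for it and a terminating zero byte. Returns the number of
 * nodes in the list.
 */
size_t
collect_values(const struct Node *start, char *buf, size_t capacity)
{
    const struct Node   *p = start;
    size_t              count = 0;

    do {
        if (count + 1 < capacity) {
            buf[count] = (char)(p->value & 0xFF);
        }
        ++count;
        p = p->next;                /* successor of node at p */
    } while (p != start);           /* until p = start */

    if (capacity > 0) {
        buf[count < capacity ? count : capacity - 1] = '\0';
    }
    return count;
}
```

Called as `collect_values(&a, buf, sizeof(buf))` with a buffer of at least five
bytes, the nodes are visited in the order `a`, `c`, `b`, `d`, following the
`next` pointers and not the order in which the nodes are defined.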