=============== Quiz: Quicksort =============== In this quiz you are supposed to write an assembly program with some more complex functions. The program should first print a string, then sort the characters of the string and finally print the sorted result: ---- CODE (type=txt) ----------------------------------------------------------- theon$ quicksort hello, world! !,dehllloorw -------------------------------------------------------------------------------- For sorting you should write a function that implements the __Quicksort__ algorithm. However, this assignment is not about the theoretical background of sorting algorithms. The purpose is to practise the calling conventions that where introduced in this session. Furthermore, you should get some idea how a compiler can translate C code into assembly code (with all optimization flags turned of, i.e. `-O0`). For that sake you will find a complete C implementation for the assigned problem and also a variant rewritten in __spaghetti code__. Translate the spaghetti code statement by statement into assembly code. On `theon` submit your program with the command `submit hpc quiz12 quicksort.s` Reading assignment ================== This interview with a Microsoft developer is more or less on-topic __Russell Hadley: The Route to C++ Code Optimization__. He works on the C++ compiler team and gives a not to technical talk some overview how compilers work, apply optimizations and finally generate assembly code. I think a lot will make sense to you after this session, or even sound familiar to you by know. For example, at minute 1:30 we mentions how nice C++ code gets translated into spaghetti code). It's also interesting to see that the guy is not using the Visual Studio IDE but Emacs (besides Vim the only legit alternative). I also stumbled over this article __The Mistakes I Made As a Beginner Programmer__. This is not directly related to the topic of Session 10 but about programming in general (it does not matter that he uses examples from JavaScript). It's great and you should read it. I love and totally agree to quotes like "For example, if you are not consistent with your indentation and capitalization, you should simply lose your license to code." or "Expert programmers love errors. Newbies hate them.". About the subscript notation used in the C code examples ======================================================== In the functions you will see the notation `A[i]` which is equivalent to `*(A+i)`. The variable `A` is declared with type `char *`, hence it is a pointer to a character ---- TIKZ ---------------------------------------------------------------------- \begin{tikzpicture} \input{memory.tex} \renewcommand\MemCellWidth {0.9} \DrawMemArrayOpen{0}{10} \DrawMemVariable[orange!50]{1}{10}{"hello, world!"} \DrawHack{0}{$0$} \begingroup \renewcommand\PaddingMemVariable {0.01} \renewcommand\PaddingMemVariable {0.05} \renewcommand\PointerDisplaceY {2} \par\endgroup \DrawMemArrayOpen{15}{25} \begingroup \renewcommand\PaddingMemVariable {0.01} \DrawMemVariable[red!50]{16}{24}{variable A} \renewcommand\PaddingMemVariable {0.05} \renewcommand\PointerDisplaceY {2} \par\endgroup \DrawMemPointer{18}{1} \end{tikzpicture} -------------------------------------------------------------------------------- With `*(A+i)` the character with address `A+i` is dereferenced: ---- TIKZ ---------------------------------------------------------------------- \begin{tikzpicture} \input{memory.tex} \renewcommand\MemCellWidth {1.5} \DrawMemArrayOpen{0}{6} \DrawMemCellContent{1}{*(A+0)} \DrawMemCellContent{2}{*(A+1)} \DrawMemCellContent{3}{*(A+2)} \DrawMemCellContent{4}{*(A+3)} \DrawMemCellContent{5}{*(A+4)} \DrawMemCellContent{6}{$\dots$} \begingroup \renewcommand\PaddingMemVariable {0.01} \renewcommand\PaddingMemVariable {0.05} \renewcommand\PointerDisplaceY {2} \par\endgroup \DrawMemArrayOpen{10}{11} \DrawHack{0}{$0$} \begingroup \renewcommand\PaddingMemVariable {0.01} \DrawMemVariable[red!50]{10}{12}{variable A} \renewcommand\PaddingMemVariable {0.05} \renewcommand\PointerDisplaceY {2} \DrawAnnotateMemCellAbove{10}{for bookkeeping: variable A has type char *} \par\endgroup \DrawMemPointer{10}{1} \end{tikzpicture} -------------------------------------------------------------------------------- And `A[i]` is just a more readable notation for that: ---- TIKZ ---------------------------------------------------------------------- \begin{tikzpicture} \input{memory.tex} \renewcommand\MemCellWidth {1.5} \DrawMemArrayOpen{0}{6} \DrawHack{0}{$0$} \DrawMemCellContent{1}{A[0]} \DrawMemCellContent{2}{A[1]} \DrawMemCellContent{3}{A[2]} \DrawMemCellContent{4}{A[3]} \DrawMemCellContent{5}{A[4]} \DrawMemCellContent{6}{$\dots$} \begingroup \renewcommand\PaddingMemVariable {0.01} \renewcommand\PaddingMemVariable {0.05} \renewcommand\PointerDisplaceY {2} \par\endgroup \DrawMemArrayOpen{10}{11} \begingroup \renewcommand\PaddingMemVariable {0.01} \DrawMemVariable[red!50]{10}{12}{variable A} \renewcommand\PaddingMemVariable {0.05} \renewcommand\PointerDisplaceY {2} \DrawAnnotateMemCellAbove{10}{for bookkeeping: variable A has type char *} \par\endgroup \DrawMemPointer{10}{1} \end{tikzpicture} -------------------------------------------------------------------------------- In the assembly code the you can use `movzbq (%X, %Y), %Z` for fetching and `movb %X, (%Y, %Z)` for storing the i-the character: - If `%X` contains the value of `A`, i.e. the address of the first character, and `%Y` contains the index `i` then `movzbq (%X, %Y), %Z` fetches `A[i]` into register `%Z`. - If `%Y` contains the value of `A`, i.e. the address of the first character, and `%Z` contains the index `i` then `movb %X, (%Y, %Z)` stores the least significant byte from `%X` in the memory location `A[i]`. When we discuss _pointer arithmetic in C_ you will see why it makes sense that the fetch and store operations in the instruction set have further variants with a scaling parameter. Here the secret in short: When you have a pointer `A` then in general `A+i` is _not_ the address of `A` plus `i`. Instead it is the address of `A` plus `i` times the size of the elements. For example, if `A` points to quad word integers, let's say it has type `int64_t *` or type `uint64_t *`, then "`A + i`" means "address stored in `A` plus `8*i`". In that case, if the value of variable `A` is in `%X` and the value of `i` in `%Y` then `movq (%X, %Y, 8), %Z` can be used for fetching a quad word into `%Z`. And vice versa, with `movq %X, (%Y, %Z, 8)` an instruction is provided that can be used for storing an array element. Example for sorting a string with bubble sort ============================================= This program is basically solving the problem with the __bubble sort__ algorithm instead of the Quicksort. For translating this manually into assembly code the same approach is followed. Some C code first get rewritten in a primitive variant that then can be translated statement by statement (and you will find here all ingredients needed for the quicksort assignment). This is the initial C code: :import: session10/func/bubblesort.c [fold] As we are reimplementing `puts` and `strlen` from the C library (with a slightly different function signature) you get some annoying warnings when you compile that. For ignoring this you can use the option `-Wno-builtin-declaration-mismatch`: ---- SHELL (path=session10/func) ----------------------------------------------- gcc -Wno-builtin-declaration-mismatch -o bubblesort bubblesort.c bubblesort -------------------------------------------------------------------------------- The bubble sort example with C spaghetti code --------------------------------------------- Procedure `bublesort` has 3 local variables. It is conveninet that you can declare local variables within compound statements. Hovever, technically it is eqivalent to declare all local variables at the beginning of the function body: + ---- CODE -------------------------------------------------------------------- void bubblesort(char *A, uint64_t n) { uint64_t swapped; do { swapped = 0; for (uint64_t i=1; i A[i]) { char tmp = A[i]; A[i] = A[i-1]; A[i-1] = tmp; swapped = 1; } } --n; } while (swapped); } ------------------------------------------------------------------------------ + ---- CODE -------------------------------------------------------------------- void bubblesort(char *A, uint64_t n) { uint64_t swapped; uint64_t i; char tmp; do { swapped = 0; for (i=1; i A[i]) { tmp = A[i]; A[i] = A[i-1]; A[i-1] = tmp; swapped = 1; } } --n; } while (swapped); } ------------------------------------------------------------------------------ Next the nice to read loop-statements and the if-statements can be rewritten with more primitive statements that are easy to translate into chunks of assembly code: + ---- CODE -------------------------------------------------------------------- void bubblesort(char *A, uint64_t n) { uint64_t swapped; uint64_t i; char tmp; do { swapped = 0; for (i=1; i A[i]) { tmp = A[i]; A[i] = A[i-1]; A[i-1] = tmp; swapped = 1; } } --n; } while (swapped); } ------------------------------------------------------------------------------ + ---- CODE -------------------------------------------------------------------- void bubblesort(char *A, uint64_t n) { uint64_t swapped; uint64_t i; char tmp; bubblesort_do: swapped = 0; i = 1; bubblesort_for: if (i >= n) goto bubblesort_endfor; if (A[i-1] <= A[i]) goto bubblesort_endif; tmp = A[i]; A[i] = A[i-1]; A[i-1] = tmp; swapped = 1; bubblesort_endif: i = i + 1; goto bubblesort_for; n = n - 1; bubblesort_endfor: if (swapped) goto bubblesort_do; } ------------------------------------------------------------------------------ This transformation of readable code into spaghetti code gets done by starting with the most outer control flow statement, describing its meaning with a flowchart and then expressing it with spaghetti code. You then can continue by repeating the procedure for nested controlstructures etc. So we begin with the do-statement in the bubble sort code which can be described and rewritten as follows: + ---- CODE -------------------------------------------------------------------- do { /* ... */ } while (swapped); ------------------------------------------------------------------------------ + ---- TIKZ -------------------------------------------------------------------- \begin{tikzpicture} \input{flowchart.tex} \renewcommand\BoxWidth {5} \renewcommand\BoxHeight {1.2} \renewcommand\BoxDistance {1.8} \SetMargin{0.01}{0.01}{1}{2} \PutStatement{0}{swapped = 0} \PutStatement{1}{ ... ;} \PutJump{2}{swapped != 0} \PutFakeStatement{3}{} \PutLabel{0}{bubblesort\_do} \AddPath{0}{1} \AddPath{1}{2} \AddPath{2}{3} \AddCondJumpPath{2}{0} \end{tikzpicture} ------------------------------------------------------------------------------ + ---- CODE -------------------------------------------------------------------- bubblesort_do: /* ... */ if (swapped) goto bubblesort_do; ------------------------------------------------------------------------------ The next inner control flow statement is the for-loop: + ---- CODE -------------------------------------------------------------------- for (i=1; i= n} \PutStatement{2}{ /* ... */ } \PutStatement{3}{ i = i + 1;} \PutJump{4}{} \PutStatement{5}{/* empty statement*/ ;} \PutLabel{1}{bubblesort\_for} \PutLabel{5}{bubblesort\_endfor} \AddPath{0}{1} \AddPath{1}{2} \AddPath{2}{3} \AddPath{3}{4} \AddCondJumpPath{1}{5} \AddJumpPathLeft{4}{1} \end{tikzpicture} ------------------------------------------------------------------------------ And the most inner control stament can be rewritten as follows + ---- CODE -------------------------------------------------------------------- i = 1; bubblesort_for: if (i >= n) goto bubblesort_endfor; /* ... */ i = i + 1; bubblesort_endfor: ; ------------------------------------------------------------------------------ Finally we reached the most inner control statement: + ---- CODE -------------------------------------------------------------------- if (A[i-1] > A[i]) { /* ... */ } ------------------------------------------------------------------------------ + ---- TIKZ -------------------------------------------------------------------- \begin{tikzpicture} \input{flowchart.tex} \renewcommand\BoxWidth {5} \renewcommand\BoxHeight {1.2} \renewcommand\BoxDistance {1.8} \SetMargin{0.01}{0.01}{1}{2} \PutJump{0}{A[i-1] <= A[i]} \PutStatement{1}{ /* ... */ } \PutStatement{2}{ /* empty statement */ ;} \PutLabel{2}{bubblesort\_endif} \AddPath{0}{1} \AddPath{1}{2} \AddCondJumpPath{0}{2} \end{tikzpicture} ------------------------------------------------------------------------------ + ---- CODE -------------------------------------------------------------------- if (A[i-1] <= A[i]) goto bubblesort_endif; /* ... */ bubblesort_endif: ; ------------------------------------------------------------------------------ Why is it called spaghetti code? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ I think you already got the idea. But nonetheless, have a look at the corresponding flowchart: + ---- TIKZ -------------------------------------------------------------------- \begin{tikzpicture} \input{flowchart.tex} \renewcommand\BoxWidth {5} \renewcommand\BoxHeight {1.2} \renewcommand\BoxDistance {1.8} \SetMargin{3}{1}{1}{5} \PutStatement{0}{ swapped = 0; } \PutStatement{1}{ i = 1; } \PutJump{2}{i >= n} \PutJump{3}{A[i-1] <= A[i]} \PutStatement{4}{tmp = A[i];} \PutStatement{5}{A[i] = A[i-1];} \PutStatement{6}{A[i-1] = tmp;} \PutStatement{7}{swapped = 1;} \PutStatement{8}{i = i + 1;} \PutJump{9}{} \PutStatement{10}{n = n - 1;} \PutJump{11}{swapped != 0} \PutFakeStatement{12}{} \PutLabel{0}{bubblesort\_do} \PutLabel{2}{bubblesort\_for} \PutLabel{8}{bubblesort\_endif} \PutLabel{10}{bubblesort\_endfor} \AddPath{0}{1} \AddPath{1}{2} \AddPath{2}{3} \AddPath{3}{4} \AddPath{4}{5} \AddPath{5}{6} \AddPath{6}{7} \AddPath{7}{8} \AddPath{8}{9} \AddPath{10}{11} \AddPath{11}{12} \AddCondJumpPath{2}{10} \begingroup \renewcommand\JumpPathDist {2} \AddCondJumpPathLeft{3}{8} \par\endgroup \begingroup \renewcommand\JumpPathDist {2.5} \AddJumpPathLeft{9}{2} \par\endgroup \begingroup \renewcommand\JumpPathDist {3} \AddCondJumpPathLeft{11}{0} \par\endgroup \end{tikzpicture} ------------------------------------------------------------------------------ + ---- CODE -------------------------------------------------------------------- void bubblesort(char *A, uint64_t n) { uint64_t swapped; uint64_t i; char tmp; bubblesort_do: swapped = 0; i = 1; bubblesort_for: if (i >= n) goto bubblesort_endfor; if (A[i-1] <= A[i]) goto bubblesort_endif; tmp = A[i]; A[i] = A[i-1]; A[i-1] = tmp; swapped = 1; bubblesort_endif: i = i + 1; goto bubblesort_for; n = n - 1; bubblesort_endfor: if (swapped) goto bubblesort_do; } ------------------------------------------------------------------------------ Complete spaghetti code ~~~~~~~~~~~~~~~~~~~~~~~ Here the complete beast: :import: session10/func/bubblesort_spaghetti.c [fold] ---- SHELL (path=session10/func) ----------------------------------------------- gcc -Wno-builtin-declaration-mismatch -o bubblesort bubblesort_spaghetti.c bubblesort -------------------------------------------------------------------------------- Implementing the bubble sort example in assembly for the ULM ------------------------------------------------------------ :import: session10/func/bubblesort.s [fold] ---- SHELL (path=session10/func) ----------------------------------------------- ulmas -o bubblesort bubblesort.s ulm bubblesort -------------------------------------------------------------------------------- Example in C for sorting a string with Quicksort ================================================ :import: session10/func/quicksort.c [fold] Compile and run: ---- SHELL (path=session10/func) ----------------------------------------------- gcc -Wno-builtin-declaration-mismatch -o quicksort quicksort.c quicksort -------------------------------------------------------------------------------- Some C spaghetti code for the Quicksort example =============================================== :import: session10/func/quicksort_spaghetti.c [fold] Compile and run: ---- SHELL (path=session10/func) ----------------------------------------------- gcc -Wno-builtin-declaration-mismatch -o quicksort quicksort_spaghetti.c quicksort -------------------------------------------------------------------------------- Skeleton for the Quicksort assembly program =========================================== For the quiz you can use the following skeleton. Function `quicksort` contains the labels from the corresponding C spaghetti code and the C statements in comments: :import: session10/func/quicksort.s [fold] In its current form it translates but function `quicksort` just returns. Therefore you see string printed twice: ---- SHELL (path=session10/func) ----------------------------------------------- ulmas -o quicksort quicksort.s ulm quicksort -------------------------------------------------------------------------------- :links: Quicksort -> https://en.wikipedia.org/wiki/Quicksort bubble sort -> https://en.wikipedia.org/wiki/Bubble_sort spaghetti code -> https://en.wikipedia.org/wiki/Spaghetti_code The Mistakes I Made As a Beginner Programmer -> https://medium.com/edge-coders/the-mistakes-i-made-as-a-beginner-programmer-ac8b3e54c312 Russell Hadley: The Route to C\+\+ Code Optimization -> https://channel9.msdn.com/Shows/Going+Deep/Russell-Hadley-The-Route-to-C-Code-Optimization