===========================================
Assembly: Loading a Literal into a Register [TOC]
===========================================

---- VIDEO ------------------------------
https://www.youtube.com/embed/-7a6L675GCI
-----------------------------------------

New `subq` Instruction: Fixing a Problem in `print_uint64`
==========================================================

Currently the instruction

---- CODE (type=s) -------------------------------------------------------------
        subq .print_uint64.buf, %p, %0
--------------------------------------------------------------------------------

is specified in the instruction set as follows:

---- CODE (type=txt) -----------------------------------------------------------
RRR (OP u 8) (X u 8) (Y u 8) (Z u 8)

# ...

0x05 RRR
:   subq X, %Y, %Z
    ulm_sub64(X, ulm_regVal(Y), Z);
--------------------------------------------------------------------------------

Hence, only 8 bits are available for the literal `.print_uint64.buf`. Sooner
or later this will cause a problem in our current implementation of
`print_uint64`, because of this code fragment:

---- CODE (type=s) -------------------------------------------------------------
print_uint64:
        // ...
        .bss
.print_uint64.buf:
        .space 20
        .text
        // ...
        subq .print_uint64.buf, %p, %0
        // ...
        ret %RET_ADDR
--------------------------------------------------------------------------------

As our program grows, the text segment will soon exceed 256 bytes. Then the
addresses in the data and BSS segments can no longer be encoded with 8 bits.
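To make the 8-bit limit concrete, here is a minimal C sketch. The helper names `fits_unsigned` and `truncate_to_u8` are made up for this illustration and are not part of the ULM sources. The sketch only shows the arithmetic: a value needs more than 8 bits as soon as it reaches 256, and only its lowest byte would survive if it were simply cut down to the 8-bit field `X`. Whether the assembler truncates or reports an error in this case is not claimed here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if `value` can be stored in an unsigned
   field that is `bits` bits wide (1 <= bits <= 64). */
static bool fits_unsigned(uint64_t value, unsigned bits)
{
    assert(bits >= 1 && bits <= 64);
    if (bits == 64) {
        return true;            /* every uint64_t fits into 64 bits */
    }
    return (value >> bits) == 0; /* no bit set above the field width */
}

/* Hypothetical helper: the part of `value` that would remain if it were
   cut down to an 8-bit field (only the lowest byte survives). */
static uint8_t truncate_to_u8(uint64_t value)
{
    return (uint8_t)(value & 0xFF);
}
```

For example, an assumed buffer address of `0x104` does not pass `fits_unsigned(0x104, 8)`, and `truncate_to_u8(0x104)` yields `0x04`, which would be a completely different address.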
In order to fix that we need two things:

- Another subtraction instruction that subtracts a register `%X` from `%Y`:

  ---- CODE (type=txt) ---------------------------------------------------------
  0x18 RRR
  :   subq %X, %Y, %Z
      ulm_sub64(ulm_regVal(X), ulm_regVal(Y), Z);
  ------------------------------------------------------------------------------

- A fix to the code of `print_uint64` so that the literal `.print_uint64.buf`
  is available in some register. Using the `@w[0-3]` operators we could achieve
  this as follows:

  ---- CODE (type=s) -----------------------------------------------------------
  print_uint64:
          .data
          .equ val, PARAM0
          .equ digit, CALLEE1
          .equ p, CALLEE2
          .equ buf, CALLEE3

          .bss
  .print_uint64.buf:
          .space 20

          .text
          # load .print_uint64.buf into %p
          ldzwq @w3(.print_uint64.buf), %p
          shldwq @w2(.print_uint64.buf), %p
          shldwq @w1(.print_uint64.buf), %p
          shldwq @w0(.print_uint64.buf), %p

          # copy %p to %buf
          movq %p, %buf

          # subtract .print_uint64.buf (stored in %buf) from %p
          subq %buf, %p, %0
          // ...
          ret %RET_ADDR
  ------------------------------------------------------------------------------

  Always using four instructions (one `ldzwq` and three `shldwq`) is
  inconvenient in handwritten code. It is also often unnecessary: our addresses
  in the text, data and BSS segments will probably always fit into 16 bits. But
  hoping that 16 bits will always suffice is like calling for
  __hit me again__.

  :links: hit me again -> https://youtu.be/rVV0Cty4lMw

  The clean solution is to use a literal pool. As it contains just one literal
  we can simplify the bookkeeping a bit:

  ---- CODE (type=s) -----------------------------------------------------------
  print_uint64:
          .data
          .equ val, PARAM0
          .equ digit, CALLEE1
          .equ p, CALLEE2
          .equ buf, CALLEE3

          .bss
  .print_uint64.buf:
          .space 20

          .text
          # load .print_uint64.buf into %p
          ldpa .print_uint.pool.buf, %p
          ldfp 0(%p), %p

          # copy %p to %buf
          movq %p, %buf

          # subtract .print_uint64.buf (stored in %buf) from %p
          subq %buf, %p, %0
          // ...
          ret %RET_ADDR

          .align 8
  .print_uint.pool.buf:
          .quad .print_uint64.buf
  ------------------------------------------------------------------------------

Of course, we want to support the special case where the displacement _Y_ in
_ldfp Y(%X), %Z_ equals zero in a more convenient way. As for
_movq Y(%X), %Z_, we simply provide an alternative in the instruction set where
_Y_ is skipped:

---- CODE (type=txt) -----------------------------------------------------------
0x17 RRR
:   ldfp Y(%X), %Z
:   ldfp (%X), %Z
    ulm_fetch64(Y * 8, X, 0, 0, ULM_ZERO_EXT, 8, Z);
--------------------------------------------------------------------------------

Now, in function _print_uint64_, we can change the line

---- CODE (type=s) -------------------------------------------------------------
        ldfp 0(%p), %p
--------------------------------------------------------------------------------

to

---- CODE (type=s) -------------------------------------------------------------
        ldfp (%p), %p
--------------------------------------------------------------------------------

Provided Material
=================

Here is an __ULM Instruction Set__ and its `isa.txt` source code, which
contains all the instructions shown in the video (and for _ldfp_ the
alternative with a zero displacement):

:import: session13/load64/0_ulm_variants/load64/isa.txt [fold]

---- SHELL (path=session13/load64/, hide) --------------------------------------
make
make refman
mkdir -p /home/www/htdocs/numerik/hpc/ss22/hpc0/session13/load64/
cp 1_ulm_build/load64/refman.pdf /home/www/htdocs/numerik/hpc/ss22/hpc0/session13/load64/
--------------------------------------------------------------------------------

:links: ULM Instruction Set -> https://www.mathematik.uni-ulm.de/numerik/hpc/ss22/hpc0/session13/load64/refman.pdf

Quiz 13: Computing the greatest common divisor
==============================================

Write a program `gcd.s` that computes the greatest common divisor (gcd) of two
64-bit unsigned integers.
Provide the user with a nice experience, i.e. using the program should look
like this:

---- CODE (type=txt) -----------------------------------------------------------
theon$ 1_ulm_build/load64/ulm gcd
a = 18
b = 12
gcd(18, 12) = 6
theon$ 1_ulm_build/load64/ulm gcd
a = 350982
b = 822647
gcd(350982, 822647) = 527
--------------------------------------------------------------------------------

Use the following algorithm for your implementation:

--- TIKZ ---------------------------------------------------------------------
\begin{adjustbox}{}
\textcolor{white}{.}
\begin{varwidth}{10cm}
\begin{algorithmic}
\Function{gcd}{$a, b$}
    \If{$a = 0 \;\lor\ b=0$}
        \State \textbf{return} $0$
    \EndIf
    \While{$a\not=b$}
        \If{$a > b$}
            \State $a\gets a - b$
        \Else
            \State $b\gets b - a$
        \EndIf
    \EndWhile
    \State \textbf{return} $b$
\EndFunction
\end{algorithmic}
\end{varwidth}
\end{adjustbox}
------------------------------------------------------------------------------

All labels are in general 64-bit literals. If you require a 64-bit literal as
an absolute address, use a literal pool. In particular, also fix the problem in
`print_uint64` as outlined above.

Your program should have an exit code of 0. Submit your program with

---- CODE (type=txt) ----------
submit hpc quiz13 isa.txt gcd.s
-------------------------------
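
As a reference for checking the output of `gcd.s`, the subtraction-based algorithm from the quiz can be sketched in C. This is only a direct transcription of the pseudocode above; the function name `gcd_u64` is chosen for this illustration, and the sketch says nothing about how the assembly program should be structured.

```c
#include <stdint.h>

/* Subtraction-based gcd, transcribed from the quiz pseudocode.
   By the quiz's convention the result is 0 if either argument is 0. */
static uint64_t gcd_u64(uint64_t a, uint64_t b)
{
    if (a == 0 || b == 0) {
        return 0;
    }
    while (a != b) {
        if (a > b) {
            a = a - b;  /* a is larger: reduce a */
        } else {
            b = b - a;  /* b is larger: reduce b */
        }
    }
    return b;           /* here a == b, both hold the gcd */
}
```

With the inputs from the sample session, `gcd_u64(18, 12)` gives 6 and `gcd_u64(350982, 822647)` gives 527.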