========================
Procedures and functions					    [TOC]
========================

Arguments and return values can also be passed on the stack. The calling
convention is simply extended to specify where on the stack these values are
stored.

Procedures
==========
We consider the case where a procedure receives two arguments `arg0` and
`arg1`, each a quad word (8 bytes) in size. The callee then expects the
following stack layout on entry to its prologue:

---- TIKZ ----------------------------------------------------------------------
\begin{tikzpicture}
\input{memory.tex}

\renewcommand\MemCellWidth { 0.4}

\DrawMemArrayOpen{-16}{39}

\DrawQuadVariable[red!20]{24}{variable arg1}
\DrawQuadVariable[red!20]{16}{variable arg0}
\DrawQuadVariable[cyan!40]{8}{reserved \%FP}
\DrawQuadVariable[cyan!40]{0}{reserved \%RET}

\DrawMemLabel{32}{\%SP + 32}
\DrawMemLabel{24}{\%SP + 24}
\DrawMemLabel{16}{\%SP + 16}
\DrawMemLabel{8}{\%SP + 8}
\DrawMemLabel{0}{\%SP}

\DrawPointer{0}{\%SP}

\end{tikzpicture}
--------------------------------------------------------------------------------

Calling a procedure
~~~~~~~~~~~~~~~~~~~
To call such a procedure, the caller has to reserve space on the stack so that
the callee can save the frame pointer and store its return address. In
addition, the caller stores the arguments at known offsets relative to the
stack pointer. So for example, in the code fragment

---- CODE (type=s) -------------------------------------------------------------
    subq    4*8,	%SP,	    %SP
    ldzwq   1,		%4
    movq    %4,		16(%SP)		    // store arg0
    ldzwq   2,		%4
    movq    %4,		24(%SP)		    // store arg1
    ldzwq   proc_foo,	%4
    jmp	    %4,		%RET
    addq    4*8,	%SP,	    %SP
--------------------------------------------------------------------------------

the procedure call `proc_foo(1, 2)` is carried out. Using the directives

---- CODE (type=s) -------------------------------------------------------------
    .equ    proc_arg0,	16
    .equ    proc_arg1,	proc_arg0+8
--------------------------------------------------------------------------------

this can be rewritten in a more readable form as

---- CODE (type=s) -------------------------------------------------------------
    subq    4*8,	%SP,	    %SP
    ldzwq   1,		%4
    movq    %4,		proc_arg0(%SP)	    // store arg0
    ldzwq   2,		%4
    movq    %4,		proc_arg1(%SP)	    // store arg1
    ldzwq   proc_foo,	%4
    jmp	    %4,		%RET
    addq    4*8,	%SP,	    %SP
--------------------------------------------------------------------------------

Implementation of a procedure
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The prologue and epilogue are the same as for subprograms. Assume the procedure
uses two local variables `local0` and `local1`. Then after the prologue

---- CODE (type=s) -------------------------------------------------------------
    // function prologue (with 2 local variables, each 8 bytes)
    movq    %RET,	ret(%SP)
    movq    %FP,	fp(%SP)
    addq    0,		%SP,	    %FP
    subq    2*8,	%SP,	    %SP
--------------------------------------------------------------------------------

the stack can be described by

---- TIKZ ----------------------------------------------------------------------
\begin{tikzpicture}
\input{memory.tex}

\renewcommand\MemCellWidth { 0.4}

\DrawMemArrayOpen{-16}{39}

\DrawQuadVariable[red!20]{24}{variable arg1}
\DrawQuadVariable[red!20]{16}{variable arg0}
\DrawQuadVariable[cyan!40]{8}{reserved \%FP}
\DrawQuadVariable[cyan!40]{0}{reserved \%RET}

\DrawQuadVariable[gray!20]{-8}{variable local0}
\DrawQuadVariable[gray!20]{-16}{variable local1}

\DrawMemLabel{24}{\%FP + 24}
\DrawMemLabel{16}{\%FP + 16}
\DrawMemLabel{8}{\%FP + 8}
\DrawMemLabel{0}{\%FP}
\DrawMemLabel{-8}{\%FP - 8}
\DrawMemLabel{-16}{\%FP - 16}

\DrawPointer{0}{\%FP}
\DrawPointer{-16}{\%SP}

\end{tikzpicture}
--------------------------------------------------------------------------------

Hence, using the above directives, the callee can access the arguments at the
memory locations `proc_arg0(%FP)` and `proc_arg1(%FP)`. Compared to the caller,
the offsets are the same, but after the prologue they are relative to the frame
pointer instead of the stack pointer:

---- TIKZ ----------------------------------------------------------------------
\begin{tikzpicture}
\input{memory.tex}

\renewcommand\MemCellWidth { 0.4}

\DrawMemArrayOpen{-16}{39}

\DrawQuadVariable[red!20]{24}{arg1(\%FP)}
\DrawQuadVariable[red!20]{16}{arg0(\%FP)}
\DrawQuadVariable[cyan!40]{8}{reserved \%FP}
\DrawQuadVariable[cyan!40]{0}{reserved \%RET}

\DrawQuadVariable[gray!20]{-8}{variable local0}
\DrawQuadVariable[gray!20]{-16}{variable local1}

\DrawMemLabel{24}{\%FP + 24}
\DrawMemLabel{16}{\%FP + 16}
\DrawMemLabel{8}{\%FP + 8}
\DrawMemLabel{0}{\%FP}
\DrawMemLabel{-8}{\%FP - 8}
\DrawMemLabel{-16}{\%FP - 16}

\DrawPointer{0}{\%FP}
\DrawPointer{-16}{\%SP}

\end{tikzpicture}
--------------------------------------------------------------------------------

Using the directives

---- CODE (type=s) -------------------------------------------------------------
    .equ    ret,	0
    .equ    fp,		8

    .equ    proc_arg0,	16
    .equ    proc_arg1,	proc_arg0+8


    .equ    local0,	-8
    .equ    local1,	local0-8
--------------------------------------------------------------------------------

all relevant memory locations can be accessed relative to the frame pointer in a
uniform way:

---- TIKZ ----------------------------------------------------------------------
\begin{tikzpicture}
\input{memory.tex}

\renewcommand\MemCellWidth { 0.4}

\DrawMemArrayOpen{-16}{39}

\DrawQuadVariable[red!20]{24}{arg1(\%FP)}
\DrawQuadVariable[red!20]{16}{arg0(\%FP)}
\DrawQuadVariable[cyan!40]{8}{fp(\%FP)}
\DrawQuadVariable[cyan!40]{0}{ret(\%FP)}

\DrawQuadVariable[gray!20]{-8}{local0(\%FP)}
\DrawQuadVariable[gray!20]{-16}{local1(\%FP)}

\DrawMemLabel{24}{\%FP + 24}
\DrawMemLabel{16}{\%FP + 16}
\DrawMemLabel{8}{\%FP + 8}
\DrawMemLabel{0}{\%FP}
\DrawMemLabel{-8}{\%FP - 8}
\DrawMemLabel{-16}{\%FP - 16}

\DrawPointer{0}{\%FP}
\DrawPointer{-16}{\%SP}

\end{tikzpicture}
--------------------------------------------------------------------------------
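
As a minimal sketch of how these displacements might be used inside a procedure
body, the following fragment adds the two arguments and stores the sum in
`local0`. The computation is only an illustration. It assumes that `%4` and
`%5` may be overwritten at this point and that `addq` accepts a register as its
first operand:

---- CODE (type=s) -------------------------------------------------------------
    // illustrative body (assumption): local0 = arg0 + arg1
    movq    proc_arg0(%FP),	%4	    // fetch arg0
    movq    proc_arg1(%FP),	%5	    // fetch arg1
    addq    %4,		%5,	    %4
    movq    %4,		local0(%FP)	    // store sum in local0
--------------------------------------------------------------------------------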


Functions
=========
Functions receive arguments like procedures, but in addition need a memory
location on the stack for passing back a return value. Again, we first consider
a function that receives just two arguments. The callee then expects the
following stack layout on entry to its prologue:

---- TIKZ ----------------------------------------------------------------------
\begin{tikzpicture}
\input{memory.tex}

\renewcommand\MemCellWidth { 0.4}

\DrawMemArrayOpen{-16}{39}

\DrawQuadVariable[red!20]{32}{variable arg1}
\DrawQuadVariable[red!20]{24}{variable arg0}
\DrawQuadVariable[cyan!40]{16}{return value}
\DrawQuadVariable[cyan!40]{8}{reserved \%FP}
\DrawQuadVariable[cyan!40]{0}{reserved \%RET}

\DrawMemLabel{32}{\%SP + 32}
\DrawMemLabel{24}{\%SP + 24}
\DrawMemLabel{16}{\%SP + 16}
\DrawMemLabel{8}{\%SP + 8}
\DrawMemLabel{0}{\%SP}


\DrawPointer{0}{\%SP}

\end{tikzpicture}
--------------------------------------------------------------------------------

Calling a function
~~~~~~~~~~~~~~~~~~

Compared to calling a procedure, the caller has to reserve 8 more bytes for the
return value, and the displacements for the arguments have to be adapted
accordingly. So for example, in the code fragment

---- CODE (type=s) -------------------------------------------------------------
    subq    5*8,	%SP,	    %SP
    ldzwq   1,		%4
    movq    %4,		24(%SP)		    // store arg0
    ldzwq   2,		%4
    movq    %4,		32(%SP)		    // store arg1
    ldzwq   func_foo,	%4
    jmp	    %4,		%RET
    addq    5*8,	%SP,	    %SP
--------------------------------------------------------------------------------

the function call `func_foo(1, 2)` is carried out. Using the directives

---- CODE (type=s) -------------------------------------------------------------
    .equ    func_arg0,	24
    .equ    func_arg1,	func_arg0+8
--------------------------------------------------------------------------------

this can be rewritten in a more readable form as

---- CODE (type=s) -------------------------------------------------------------
    subq    5*8,	%SP,	    %SP
    ldzwq   1,		%4
    movq    %4,		func_arg0(%SP)	    // store arg0
    ldzwq   2,		%4
    movq    %4,		func_arg1(%SP)	    // store arg1
    ldzwq   func_foo,	%4
    jmp	    %4,		%RET
    addq    5*8,	%SP,	    %SP
--------------------------------------------------------------------------------


Implementation of a function
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The prologue and epilogue are the same as for subprograms or procedures. Assume
the function uses two local variables `local0` and `local1`. Then after the
prologue

---- CODE (type=s) -------------------------------------------------------------
    // function prologue (with 2 local variables, each 8 bytes)
    movq    %RET,	ret(%SP)
    movq    %FP,	fp(%SP)
    addq    0,		%SP,	    %FP
    subq    2*8,	%SP,	    %SP
--------------------------------------------------------------------------------

the stack can be described by

---- TIKZ ----------------------------------------------------------------------
\begin{tikzpicture}
\input{memory.tex}

\renewcommand\MemCellWidth { 0.4}

\DrawMemArrayOpen{-16}{39}

\DrawMemVariable[red!40]{-8}{0}{Used}
\DrawQuadVariable[red!20]{32}{variable arg1}
\DrawQuadVariable[red!20]{24}{variable arg0}
\DrawQuadVariable[cyan!40]{16}{for return value}
\DrawQuadVariable[cyan!40]{8}{\%FP}
\DrawQuadVariable[cyan!40]{0}{\%RET}
\DrawQuadVariable[gray!20]{-8}{variable local0}
\DrawQuadVariable[gray!20]{-16}{variable local1}

\DrawMemLabel{32}{\%FP + 32}
\DrawMemLabel{24}{\%FP + 24}
\DrawMemLabel{16}{\%FP + 16}
\DrawMemLabel{8}{\%FP + 8}
\DrawMemLabel{0}{\%FP}
\DrawMemLabel{-8}{\%FP - 8}
\DrawMemLabel{-16}{\%FP - 16}


\DrawPointer{0}{\%FP}
\DrawPointer{-16}{\%SP}

\end{tikzpicture}
--------------------------------------------------------------------------------


Using the directives

---- CODE (type=s) -------------------------------------------------------------
    .equ    ret,	0
    .equ    fp,		8
    .equ    rval,	16

    .equ    func_arg0,	24
    .equ    func_arg1,	func_arg0+8


    .equ    local0,	-8
    .equ    local1,	local0-8
--------------------------------------------------------------------------------

as before, all relevant memory locations can be accessed relative to the frame
pointer in the same uniform way:

---- TIKZ ----------------------------------------------------------------------
\begin{tikzpicture}
\input{memory.tex}

\renewcommand\MemCellWidth { 0.4}

\DrawMemArrayOpen{-16}{39}

\DrawMemVariable[red!40]{-8}{0}{Used}
\DrawQuadVariable[red!20]{32}{arg1(\%FP)}
\DrawQuadVariable[red!20]{24}{arg0(\%FP)}
\DrawQuadVariable[cyan!40]{16}{rval(\%FP)}
\DrawQuadVariable[cyan!40]{8}{fp(\%FP)}
\DrawQuadVariable[cyan!40]{0}{ret(\%FP)}
\DrawQuadVariable[gray!20]{-8}{local0(\%FP)}
\DrawQuadVariable[gray!20]{-16}{local1(\%FP)}

\DrawPointer{0}{\%FP}
\DrawPointer{-16}{\%SP}

\end{tikzpicture}
--------------------------------------------------------------------------------


Recipe for the general calling convention
=========================================
From the above a general pattern for calling and implementing functions can be
derived and provided in a recipe style. However, this will just cover the case
that all arguments have the size of 8 bytes. This in particular avoids the need
to incorporate alignment restrictions. Other limitations of the recipe are
mentioned in the description.

Directives for accessing memory locations
-----------------------------------------
For calling and implementing functions we can use and extend the following
directives:

---- CODE (type=s) -------------------------------------------------------------
    .equ    ret,	0
    .equ    fp,		8
    .equ    rval,	16	    // return value (functions only)

    .equ    proc_arg0,	16
    .equ    proc_arg1,	proc_arg0+8
    /*
	add proc_arg2, proc_arg3, etc. as needed
    */

    .equ    func_arg0,	24
    .equ    func_arg1,	func_arg0+8
    /*
	add func_arg2, func_arg3, etc. as needed
    */


    .equ    local0,	-8
    .equ    local1,	local0-8
    /*
	add local2, local3, etc. as needed
    */
--------------------------------------------------------------------------------

As the displacement is encoded with a single byte, this pattern "only" works if
the number of arguments or local variables does not exceed 32. The ULM C
compiler can also deal with cases that exceed this limit.

Calling functions and procedures
--------------------------------


Calling a procedure
~~~~~~~~~~~~~~~~~~~
Replace `NUM_ARGS` with the number of arguments, `PROC_LABEL` with the name of
the procedure, and replace `%4` with an available register in your context:

---- CODE (type=s) -------------------------------------------------------------
    subq    (NUM_ARGS+2)*8, %SP,        %SP
    // load arg0 in %4
    movq    %4,		    proc_arg0(%SP)      // store arg0
    /*
	store further arguments arg1, arg2, etc as needed
    */

    ldzwq   PROC_LABEL,	    %4
    jmp     %4,		    %RET
    addq    (NUM_ARGS+2)*8, %SP,        %SP
--------------------------------------------------------------------------------

Calling a function
~~~~~~~~~~~~~~~~~~
As for procedures, just replace `NUM_ARGS` and `PROC_LABEL` below. After this
code fragment has been executed, the return value is stored in `%4` (which can
be replaced by any available register in your context):

---- CODE (type=s) -------------------------------------------------------------
    subq    (NUM_ARGS+3)*8, %SP,        %SP
    // load arg0 in %4
    movq    %4,		    func_arg0(%SP)      // store arg0
    /*
	store further arguments arg1, arg2, etc as needed
    */

    ldzwq   PROC_LABEL,	    %4
    jmp     %4,		    %RET

    // fetch return value in %4
    movq    rval(%SP),	    %4

    addq    (NUM_ARGS+3)*8, %SP,        %SP
--------------------------------------------------------------------------------
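
As an illustration, the recipe could be instantiated as follows for the call
`func_foo(1, 2)` used earlier, i.e. with `NUM_ARGS` equal to 2 and therefore
`(2+3)*8 = 5*8` bytes reserved. This is only a sketch. It assumes that `%4` is
free and that `rval` is defined as 16, as in the directives of the section on
functions:

---- CODE (type=s) -------------------------------------------------------------
    subq    5*8,	%SP,	    %SP	    // ret, fp, rval, arg0, arg1
    ldzwq   1,		%4
    movq    %4,		func_arg0(%SP)	    // store arg0
    ldzwq   2,		%4
    movq    %4,		func_arg1(%SP)	    // store arg1
    ldzwq   func_foo,	%4
    jmp	    %4,		%RET
    movq    rval(%SP),	%4		    // fetch return value in %4
    addq    5*8,	%SP,	    %SP
--------------------------------------------------------------------------------

Note that the return value is fetched before the stack pointer is restored, as
`rval` is a displacement relative to the stack pointer set up for the call.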


Implementing functions and procedures
-------------------------------------

Skeleton
~~~~~~~~
In the following skeleton replace `FUNC_OR_PROC_LABEL` with the function or
procedure name, and replace `NUM_LOCALS` with the number of local variables:

---- CODE (type=s) -------------------------------------------------------------
FUNC_OR_PROC_LABEL:
    // function prologue
    movq	%RET,		ret(%SP)
    movq	%FP,		fp(%SP)
    addq	0,		%SP,		%FP

    // reserve space for local variables.
    subq	NUM_LOCALS*8,	%SP,		%SP

    // begin of the function body

    /*

	Implementation of the function or procedure

    */

    // end of the function body

    // function epilogue
    addq	0,		%FP,		%SP
    movq	fp(%SP),	%FP
    movq	ret(%SP),	%RET
    jmp		%RET,		%0
--------------------------------------------------------------------------------
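
As a sketch of a filled-in skeleton, a hypothetical function `func_sum` that
returns the sum of its two arguments might look as follows. The name
`func_sum` and the body are illustrations only. The sketch assumes that `%4`
and `%5` may be overwritten by the callee, that `addq` accepts a register as
its first operand, and that `rval` is defined as 16 and `func_arg0` as 24. As
the function has no local variables, `NUM_LOCALS` is 0 and the `subq`
instruction is left out:

---- CODE (type=s) -------------------------------------------------------------
func_sum:
    // function prologue
    movq	%RET,		ret(%SP)
    movq	%FP,		fp(%SP)
    addq	0,		%SP,		%FP

    // function body (assumption): rval = arg0 + arg1
    movq	func_arg0(%FP),	%4
    movq	func_arg1(%FP),	%5
    addq	%4,		%5,		%4
    movq	%4,		rval(%FP)

    // function epilogue
    addq	0,		%FP,		%SP
    movq	fp(%SP),	%FP
    movq	ret(%SP),	%RET
    jmp		%RET,		%0
--------------------------------------------------------------------------------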

Directives for argument names and local variables
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You don't want to use names like `proc_arg0` for the arguments your procedure
receives or `local0` for its local variables? No problem, use directives for
more suitable names, e.g.

---- CODE (type=s) -------------------------------------------------------------
    .equ	a,	proc_arg0
    .equ	b,	proc_arg1

    .equ	i,	local0
    .equ	j,	local1
--------------------------------------------------------------------------------

or analogously for functions, e.g.

---- CODE (type=s) -------------------------------------------------------------
    .equ	a,	func_arg0
    .equ	b,	func_arg1

    .equ	i,	local0
    .equ	j,	local1
--------------------------------------------------------------------------------
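
With such names in place, the body of a procedure or function can refer to its
arguments and local variables directly. The following fragment is a minimal
sketch that copies the arguments `a` and `b` into the local variables `i` and
`j`. It assumes that it is placed after the prologue, so that `%FP` is already
set up, and that `%4` may be overwritten:

---- CODE (type=s) -------------------------------------------------------------
    movq	a(%FP),		%4
    movq	%4,		i(%FP)		// i = a
    movq	b(%FP),		%4
    movq	%4,		j(%FP)		// j = b
--------------------------------------------------------------------------------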