=================================
Testing the matrix-matrix product [TOC]
=================================
Before we implement a GEMM operation with some simple cache optimizations, we
need to set up a testing framework. We first state a simple criterion for the
numerical comparison of two GEMM results. See __Testing GEMM__ for more
background on this topic.
Recall that the GEMM operation
---- LATEX ---------------------------------------------------------------------
C \leftarrow \beta C + \alpha A B
--------------------------------------------------------------------------------
overwrites $C$ with the result of the operation. In the following we will
denote
- with $C_0$ the initial value of $C$ and
- with $\widehat{C}$ a trusted result of the GEMM operation.
For a matrix $C$ produced by another GEMM implementation we then require that
---- LATEX ---------------------------------------------------------------------
\frac{\|C - \widehat{C}\,\|_\infty}{
\text{eps}\cdot\left(
\max\{m,n,k\} \cdot
|\alpha| \cdot
\|A\|_\infty \|B\|_\infty + |\beta|\cdot\|C_0\|_\infty
\right)}
\leq \tau = \mathcal{O}(1)
--------------------------------------------------------------------------------
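holds. Here $\text{eps}$ denotes the machine epsilon, and $A$, $B$ and $C$ are
matrices of dimension $m \times k$, $k \times n$ and $m \times n$. In practice,
a test suite provides a default value (e.g. 1) for the threshold $\tau$, which
can be overridden through compiler flags.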
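The criterion above can be sketched in C as follows. This is only a minimal illustration under some assumptions: the function names `diff_norm_inf` and `gemm_err_est` and their signatures are hypothetical, the norms of $A$, $B$ and $C_0$ are assumed to be computed by the caller, and `DBL_EPSILON` is used for eps.

```c
#include <float.h>
#include <math.h>
#include <stddef.h>

/* Infinity norm of X - Y for two m x n matrices, i.e. the largest
   absolute row sum of the difference. A NaN in any row is propagated,
   so a faulty result cannot slip through the comparison.
   (Hypothetical helper, plain reference loops, not cache optimized.) */
static double
diff_norm_inf(size_t m, size_t n,
              const double *X, ptrdiff_t incRowX, ptrdiff_t incColX,
              const double *Y, ptrdiff_t incRowY, ptrdiff_t incColY)
{
    double res = 0;

    for (size_t i = 0; i < m; ++i) {
        double sum = 0;
        for (size_t j = 0; j < n; ++j) {
            double x = X[(ptrdiff_t)i*incRowX + (ptrdiff_t)j*incColX];
            double y = Y[(ptrdiff_t)i*incRowY + (ptrdiff_t)j*incColY];
            sum += fabs(x - y);
        }
        if (isnan(sum)) {
            return sum;
        }
        if (sum > res) {
            res = sum;
        }
    }
    return res;
}

/* Left-hand side of the criterion: the scaled difference between the
   trusted result Cref and the result C under test. normA, normB and
   normC0 are the infinity norms of A, B and C_0 (computed by the caller).
   If the scaling factor is zero the result is NaN or infinity, which
   fails any comparison of the form "<= tau". */
double
gemm_err_est(size_t m, size_t n, size_t k,
             double alpha, double normA, double normB,
             double beta, double normC0,
             const double *Cref, ptrdiff_t incRowCref, ptrdiff_t incColCref,
             const double *C, ptrdiff_t incRowC, ptrdiff_t incColC)
{
    size_t N = m;   /* N = max{m, n, k} */
    if (n > N) N = n;
    if (k > N) N = k;

    double diff  = diff_norm_inf(m, n,
                                 C, incRowC, incColC,
                                 Cref, incRowCref, incColCref);
    double scale = DBL_EPSILON * ((double)N * fabs(alpha) * normA * normB
                                  + fabs(beta) * normC0);
    return diff / scale;
}
```

A test would then accept the result if `gemm_err_est(...) <= tau` holds, where `tau` is the threshold chosen for the test suite.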
:links: __Testing GEMM__ -> http://www.netlib.org/utk/people/JackDongarra/WEB-PAGES/Batched-BLAS-2016/Day1/02_Mawussi_BBLAS_testing.pdf
Exercise: Implement some tools for the error estimator
======================================================
We will need an implementation for each of the operations below. If possible,
each operation should access the matrix elements in a cache-friendly way (so
that we can use them as efficient building blocks for other operations):
- `dgeaxpy`: Compute the update $B \leftarrow \alpha A + B$ for $m \times n$
matrices $A$ and $B$.
Note: If $B$ is stored in column-major order then the elements of both
matrices should be accessed column-wise. Otherwise the elements should be
accessed row-wise only.
- `dgescal`: Compute the scaling $A \leftarrow \alpha A$ for an $m \times n$
matrix $A$.
Make sure that special cases for $\alpha$ are treated efficiently and
correctly.
- `dgecopy`: Copy matrices, i.e. $B \leftarrow A$ for two $m \times n$ matrices.
Note: If $B$ is stored in column-major order then the elements of both
matrices should be accessed column-wise. Otherwise the elements should be
accessed row-wise only.
- `dgenorm_inf`: Compute the infinity norm $\|A\|_\infty$ (the maximal absolute
row sum) of an $m \times n$ matrix $A$.
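To illustrate what "cache friendly" and "special cases for $\alpha$" can mean here, the following is one possible sketch of `dgescal`. The signature is an assumption (it is not taken from the skeleton), and the loop order is chosen by comparing the two increments, which presumes non-negative increments. For $\alpha = 0$ the elements are overwritten with zero instead of being multiplied, so that _NaN_ or infinite entries do not survive.

```c
#include <stddef.h>

/* Sketch: A <- alpha*A for an m x n matrix A with element (i,j) stored at
   A[i*incRowA + j*incColA]. Signature and stride test are assumptions. */
void
dgescal(size_t m, size_t n, double alpha,
        double *A, ptrdiff_t incRowA, ptrdiff_t incColA)
{
    /* alpha == 1: nothing to do, so do not touch the memory at all */
    if (alpha == 1.0 || m == 0 || n == 0) {
        return;
    }

    /* If rows are closer together in memory than columns, work on the
       transposed view. After the swap incRowA <= incColA holds, so this
       recursion happens at most once. */
    if (incRowA > incColA) {
        dgescal(n, m, alpha, A, incColA, incRowA);
        return;
    }

    /* Inner loop runs with the smaller increment incRowA */
    if (alpha == 0.0) {
        /* Overwrite: 0*NaN would be NaN, so we assign instead of scaling */
        for (size_t j = 0; j < n; ++j) {
            for (size_t i = 0; i < m; ++i) {
                A[(ptrdiff_t)i*incRowA + (ptrdiff_t)j*incColA] = 0.0;
            }
        }
    } else {
        for (size_t j = 0; j < n; ++j) {
            for (size_t i = 0; i < m; ++i) {
                A[(ptrdiff_t)i*incRowA + (ptrdiff_t)j*incColA] *= alpha;
            }
        }
    }
}
```

The same stride test could be reused for `dgeaxpy` and `dgecopy`, where the storage order of $B$ decides the loop order for both matrices.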
Testing the tools
=================
Use the following skeleton for your implementation and for testing, but make
sure that you understand this code well:
- What cases for the storage order of $A$ and $B$ are tested?
- How could all possible cases for the storage order be tested?
Also note that the function `initMatrix` was modified:
- Elements are accessed in a cache-friendly way (you can use this pattern!).
- If the additional argument `withNan` is `true`, the matrix gets initialized
with _NaN_ (not a number) entries.
- Otherwise the matrix gets initialized with random values.
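For orientation, the pattern described above could look roughly like the following sketch. It is not the code from the skeleton: the name `initMatrixSketch`, its signature, and the range of the random values are assumptions made for illustration only.

```c
#include <math.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Sketch: initialize an m x n matrix either with NaN or with random
   values in [-0.5, 0.5]. Element (i,j) is stored at
   A[i*incRowA + j*incColA]. Non-negative increments are assumed. */
void
initMatrixSketch(size_t m, size_t n,
                 double *A, ptrdiff_t incRowA, ptrdiff_t incColA,
                 bool withNan)
{
    /* Work on the transposed view if that gives the inner loop the
       smaller increment (recursion happens at most once). */
    if (incRowA > incColA) {
        initMatrixSketch(n, m, A, incColA, incRowA, withNan);
        return;
    }

    for (size_t j = 0; j < n; ++j) {
        for (size_t i = 0; i < m; ++i) {
            double value = withNan
                         ? NAN
                         : (double)rand() / RAND_MAX - 0.5;
            A[(ptrdiff_t)i*incRowA + (ptrdiff_t)j*incColA] = value;
        }
    }
}
```

Initializing an output matrix with _NaN_ makes it easy to detect elements that an operation such as `dgecopy` failed to overwrite, because any leftover _NaN_ shows up in the computed norm.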
:import: session05/solution/norm_ex.c
:navigate: up -> doc:index
next -> doc:session05/page02