=================================
Testing the matrix-matrix product [TOC]
=================================

Before we implement a GEMM operation with some simple cache optimization, we
need to set up a testing framework.  We first state a simple criterion for the
numerical comparison of two GEMM results.  See __Testing GEMM__ for more
background on this topic.

Recall that the GEMM operation

---- LATEX ---------------------------------------------------------------------
C \leftarrow \beta C + \alpha A B
--------------------------------------------------------------------------------

overwrites $C$ with the result of the operation.  Here $A$ is an $m \times k$
matrix, $B$ a $k \times n$ matrix and $C$ an $m \times n$ matrix.  In the
following we will denote

- with $C_0$ the initial value of $C$ and
- with $\widehat{C}$ a trusted result of the GEMM operation.

For a matrix $C$ produced by another GEMM operation we then require that

---- LATEX ---------------------------------------------------------------------
\frac{\|C - \widehat{C}\,\|_\infty}{
      \text{eps}\cdot\left(
          \max\{m,n,k\} \cdot |\alpha| \cdot \|A\|_\infty \|B\|_\infty
          + |\beta|\cdot\|C_0\|_\infty
      \right)}
\leq \tau = \mathcal{O}(1)
--------------------------------------------------------------------------------

holds.  In practice a test suite will have a default value (e.g. 1) for the
threshold $\tau$ that the user can override through compiler flags.

:links: __Testing GEMM__ -> http://www.netlib.org/utk/people/JackDongarra/WEB-PAGES/Batched-BLAS-2016/Day1/02_Mawussi_BBLAS_testing.pdf

Exercise: Implement some tools for the error estimator
======================================================

We will need an implementation for each of the operations below.  If possible,
each operation should access the elements in a cache-friendly way (so we can
use them as efficient building blocks for other operations):

- dgeaxpy: Compute the update $B \leftarrow \alpha A + B$ for $m \times n$
  matrices $A$ and $B$.
  Note: If $B$ is stored in column major order then the elements of both
  matrices should be accessed column-wise.  Otherwise the elements should be
  accessed row-wise.
- dgescal: Compute the scaling $A \leftarrow \alpha A$ for an $m \times n$
  matrix $A$.  Make sure that special cases for $\alpha$ are treated
  efficiently and correctly.
- dgecopy: Copy matrices, i.e. $B \leftarrow A$ for two $m \times n$ matrices.
  Note: If $B$ is stored in column major order then the elements of both
  matrices should be accessed column-wise.  Otherwise the elements should be
  accessed row-wise.
- dgenorm_inf: Compute the infinity norm of an $m \times n$ matrix.

Testing the tools
=================

Use the following skeleton for your implementation and for testing.  Make sure
that you have a good understanding of this code:

- What cases for the storage order of $A$ and $B$ are tested?
- How could all possible cases for the storage order be tested?

Also note that the function initMatrix was modified:

- Elements are accessed in a cache-friendly way (you can use this pattern!).
- If the additional argument withNan is true, the matrix gets initialized with
  _NaN_ (not a number) entries.
- Otherwise the matrix gets initialized with random values.

:import: session05/solution/norm_ex.c

:navigate: up -> doc:index
           next -> doc:session05/page02
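
Once the four tools exist, the test criterion from the top of this page reduces
to a few scalar operations.  The following is only a minimal sketch and not
part of the skeleton: the function name gemm_err_est and its parameter list are
hypothetical, the infinity norms are assumed to have been computed beforehand
(e.g. with dgenorm_inf), and "eps" is assumed to mean DBL_EPSILON from
<float.h>.

```c
#include <assert.h>
#include <float.h>
#include <math.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from the skeleton): evaluates
 *
 *      normDiff / (eps * (max{m,n,k} * |alpha| * normA * normB
 *                         + |beta| * normC0))
 *
 * where normDiff is the infinity norm of C - C_hat and normA, normB,
 * normC0 are the infinity norms of A, B and the initial C.
 * Assumption: eps is taken to be DBL_EPSILON.
 */
double
gemm_err_est(size_t m, size_t n, size_t k,
             double alpha, double normA, double normB,
             double beta, double normC0,
             double normDiff)
{
    assert(normDiff >= 0);

    /* max{m, n, k} */
    size_t N = m;
    if (n > N) {
        N = n;
    }
    if (k > N) {
        N = k;
    }

    double denom = DBL_EPSILON * ((double)N * fabs(alpha) * normA * normB
                                  + fabs(beta) * normC0);

    /* Degenerate case: the bound is zero, so only an exact match passes. */
    if (denom == 0) {
        return (normDiff == 0) ? 0 : INFINITY;
    }
    return normDiff / denom;
}
```

A test would then accept the computed $C$ if the returned value does not exceed
the threshold $\tau$.  Note that if $C$ contains a NaN entry the norm of the
difference is typically NaN as well, so the comparison against $\tau$ should be
written such that a NaN counts as a failure.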