=================
GEMM Macro Kernel [TOC]
=================

The GEMM macro kernel is meant to perform the GEMM operation

---- LATEX ---------------------------------------------------------------------
C \leftarrow \beta \cdot C + \alpha \cdot A \cdot B
--------------------------------------------------------------------------------

for the following special case:

- $C$ is an $m_c \times n_c$ matrix.

- Matrices $A$ and $B$ are given in a packed storage format:

  - Originally, $A$ was an $m_c \times k_c$ matrix.  It was partitioned into
    horizontal $M_r \times k_c$ panels, and these panels were packed into a
    buffer.  The macro kernel has access only to this buffer.  If $m_c$ is not
    a multiple of $M_r$, the last panel was padded with zeros.

  - Originally, $B$ was a $k_c \times n_c$ matrix.  It was partitioned into
    vertical $k_c \times N_r$ panels, and these panels were packed into a
    buffer.  The macro kernel has access only to this buffer.  If $n_c$ is not
    a multiple of $N_r$, the last panel was padded with zeros.

The macro kernel performs its operation by calling the micro kernel, which
multiplies one panel of $A$ with one panel of $B$.

Exercise
========

- Implement the function `dgemm_macro` in the test program below.  Before you
  start coding, familiarize yourself with the test program.

Simple Test Program
===================

:import: session16/simple_test_macro_ex.c
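
The packed format described above implies some index arithmetic that is needed when walking over the panels: how many panels there are, where each panel starts in its buffer, and how many rows or columns of $C$ the last (possibly zero-padded) panel really covers. The following is a minimal sketch of that arithmetic only, not of `dgemm_macro` itself. The helper names `num_panels`, `panel_offset` and `panel_extent` are hypothetical and do not appear in the test program. The sketch assumes that the panels are stored one after another in the buffer and that every panel, including a padded one, occupies the full $M_r \cdot k_c$ (respectively $k_c \cdot N_r$) elements.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helpers (not part of the test program).
 * `dim` is m_c or n_c, `r` is M_r or N_r, `kc` is k_c.
 */

/* Number of panels needed to cover `dim` rows (or columns). */
static size_t
num_panels(size_t dim, size_t r)
{
    assert(r > 0);
    return (dim + r - 1) / r;   /* rounds up */
}

/*
 * Offset, in elements, of panel `p` inside the packed buffer.
 * Assumption: panels are stored consecutively and each one holds
 * exactly r*kc elements (a padded last panel is stored in full).
 */
static size_t
panel_offset(size_t p, size_t r, size_t kc)
{
    return p * r * kc;
}

/*
 * Number of rows (or columns) of C that panel `p` really covers.
 * All panels cover `r`, except a zero-padded last panel, which
 * covers only dim % r.  Requires p < num_panels(dim, r).
 */
static size_t
panel_extent(size_t p, size_t dim, size_t r)
{
    size_t np = num_panels(dim, r);

    return (p + 1 < np || dim % r == 0) ? r : dim % r;
}
```

For example, with $m_c = 10$ and $M_r = 4$ there are three panels of $A$, and the last one covers only two rows of $C$. A macro kernel has to make sure that the micro kernel result for such a panel updates only those two rows.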