===============================================================
General Matrix Matrix Product (GEMM): Simple Cache Optimization [TOC]
===============================================================

---- SHELL (path=session15/, hide) ---------------------------------------------
rm -rf gemm
mkdir gemm
--------------------------------------------------------------------------------

Needed Material
===============
For this exercise we need:

- The test program `test_dgemm.c`.  You can copy it from
  `/home/numerik/pub/hpc/ss18/ulmblas/`:

  ---- SHELL (path=session15/gemm) ---------------------------------------------
  cp /home/numerik/pub/hpc/ss18/ulmblas/test_dgemm.c .
  ------------------------------------------------------------------------------

- The auxiliary functions declared in `ulmaux.h` and defined in `ulmaux.c`:

  ---- SHELL (path=session15/gemm) ---------------------------------------------
  cp /home/numerik/pub/hpc/ss18/ulmblas/ulmaux.* .
  ------------------------------------------------------------------------------

- The header file `ulmblas.h`:

  ---- SHELL (path=session15/gemm) ---------------------------------------------
  cp /home/numerik/pub/hpc/ss18/ulmblas/ulmblas.h .
  ------------------------------------------------------------------------------

- The *incomplete* implementation `ulmblas.c` in
  `/home/numerik/pub/hpc/ss18/ulmblas/session15a`:

  ---- SHELL (path=session15/gemm) ---------------------------------------------
  cp /home/numerik/pub/hpc/ss18/ulmblas/session15a/ulmblas.c .
  ------------------------------------------------------------------------------

Exercise
========
- Implement the function `dgemm` in `ulmblas.c`.  Use the simple cache
  optimization introduced in the lecture.

- Check your implementation by running `test_dgemm check`.

- Benchmark different combinations of storage orders.  For example, run

    - `test_dgemm bench colmajorA colmajorB colmajorC`
    - `test_dgemm bench rowmajorA colmajorB colmajorC`
    - `test_dgemm bench rowmajorA rowmajorB colmajorC`
    - ...
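
The lecture's simple cache optimization is not spelled out on this page, so the following is only a sketch of one common reading of it: scale `C` by `beta` first, then pick the loop order for the update `C <- C + alpha*A*B` so that the innermost loop walks through `C` with the smaller stride. The names `dgemm_sketch` and `scale_sketch` and the parameter list are assumptions made for illustration; the actual declaration to implement is the one in `ulmblas.h`. The sketch also assumes positive increments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: C <- beta*C, visiting C in memory order. */
static void
scale_sketch(ptrdiff_t m, ptrdiff_t n, double beta,
             double *C, ptrdiff_t incRowC, ptrdiff_t incColC)
{
    if (beta == 1.0) {
        return;
    }
    if (incRowC < incColC) {
        /* Rows are closer in memory: run over i in the inner loop. */
        for (ptrdiff_t j = 0; j < n; ++j) {
            for (ptrdiff_t i = 0; i < m; ++i) {
                double *c = &C[i * incRowC + j * incColC];
                /* beta == 0 overwrites, so old NaN/Inf entries vanish. */
                *c = (beta == 0.0) ? 0.0 : beta * *c;
            }
        }
    } else {
        for (ptrdiff_t i = 0; i < m; ++i) {
            for (ptrdiff_t j = 0; j < n; ++j) {
                double *c = &C[i * incRowC + j * incColC];
                *c = (beta == 0.0) ? 0.0 : beta * *c;
            }
        }
    }
}

/* Hypothetical signature: C <- beta*C + alpha*A*B with A (m x k), B (k x n). */
void
dgemm_sketch(size_t m_, size_t n_, size_t k_,
             double alpha,
             const double *A, ptrdiff_t incRowA, ptrdiff_t incColA,
             const double *B, ptrdiff_t incRowB, ptrdiff_t incColB,
             double beta,
             double *C, ptrdiff_t incRowC, ptrdiff_t incColC)
{
    ptrdiff_t m = (ptrdiff_t)m_, n = (ptrdiff_t)n_, k = (ptrdiff_t)k_;

    /* Assumption of this sketch: strides of C are positive. */
    assert(incRowC > 0 && incColC > 0);

    scale_sketch(m, n, beta, C, incRowC, incColC);
    if (alpha == 0.0 || k == 0) {
        return;
    }

    if (incRowC < incColC) {
        /* C is (like) col major: C(:,j) += alpha*B(l,j) * A(:,l). */
        for (ptrdiff_t j = 0; j < n; ++j) {
            for (ptrdiff_t l = 0; l < k; ++l) {
                double b = alpha * B[l * incRowB + j * incColB];
                for (ptrdiff_t i = 0; i < m; ++i) {
                    C[i * incRowC + j * incColC]
                        += b * A[i * incRowA + l * incColA];
                }
            }
        }
    } else {
        /* C is (like) row major: C(i,:) += alpha*A(i,l) * B(l,:). */
        for (ptrdiff_t i = 0; i < m; ++i) {
            for (ptrdiff_t l = 0; l < k; ++l) {
                double a = alpha * A[i * incRowA + l * incColA];
                for (ptrdiff_t j = 0; j < n; ++j) {
                    C[i * incRowC + j * incColC]
                        += a * B[l * incRowB + j * incColB];
                }
            }
        }
    }
}
```

In this sketch only the storage order of `C` selects the loop order. The inner loop then reads `A` by columns in the first branch and `B` by rows in the second, so the storage orders of `A` and `B` still decide how cache friendly those reads are. That may be one reason the benchmark results differ between the storage-order combinations listed in the exercise.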