General Matrix Matrix Product (GEMM): Simple Cache Optimization
Needed Material
For this exercise we need:
-
The test program test_dgemm.c. You can copy it from /home/numerik/pub/hpc/ss18/ulmblas/:
thales$ cp /home/numerik/pub/hpc/ss18/ulmblas/test_dgemm.c .
-
The auxiliary functions declared in ulmaux.h and defined in ulmaux.c:
thales$ cp /home/numerik/pub/hpc/ss18/ulmblas/ulmaux.* .
-
The header file ulmblas.h:
thales$ cp /home/numerik/pub/hpc/ss18/ulmblas/ulmblas.h .
-
The incomplete implementation ulmblas.c in /home/numerik/pub/hpc/ss18/ulmblas/session15a:
thales$ cp /home/numerik/pub/hpc/ss18/ulmblas/session15a/ulmblas.c .
Exercise
-
Implement the function dgemm in ulmblas.c. Use the simple cache optimization introduced in the lecture.
-
Check your implementation by running test_dgemm check.
-
Benchmark different combinations of storage orders. For example, run
-
test_dgemm bench colmajorA colmajorB colmajorC
-
test_dgemm bench rowmajorA colmajorB colmajorC
-
test_dgemm bench rowmajorA rowmajorB colmajorC
-
...
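The variant of the cache optimization shown in the lecture is not reproduced on this page, so the following is only a minimal sketch of one common reading: choose the loop order from the strides of C so that the innermost loop walks through C with the smaller stride. The name dgemm_sketch, its parameter list, and the ELEM macro are assumptions made for illustration and need not match the declaration in ulmblas.h. The sketch also assumes non-negative strides.

```c
#include <assert.h>
#include <stddef.h>

/* Element (i,j) of a matrix with row stride incRow and column stride incCol. */
#define ELEM(X, i, j, incRow, incCol) \
    (X)[(ptrdiff_t)(i) * (incRow) + (ptrdiff_t)(j) * (incCol)]

/*
 * Hypothetical sketch (name and signature are assumptions, not the course's
 * reference solution): C <- beta*C + alpha*A*B for an m x k matrix A,
 * a k x n matrix B and an m x n matrix C, each given by pointer and strides.
 * Strides are assumed to be non-negative.
 */
void
dgemm_sketch(size_t m, size_t n, size_t k,
             double alpha,
             const double *A, ptrdiff_t incRowA, ptrdiff_t incColA,
             const double *B, ptrdiff_t incRowB, ptrdiff_t incColB,
             double beta,
             double *C, ptrdiff_t incRowC, ptrdiff_t incColC)
{
    assert(m == 0 || n == 0 || C != NULL);

    /* Nonzero if moving down a column of C is the smaller stride. */
    int colMajorC = incRowC <= incColC;

    /* Step 1: C <- beta*C, traversed with the smaller stride innermost.
       For beta == 0 the entries are overwritten, so stale NaNs vanish. */
    if (beta != 1.0) {
        size_t outer = colMajorC ? n : m;
        size_t inner = colMajorC ? m : n;
        for (size_t o = 0; o < outer; ++o) {
            for (size_t p = 0; p < inner; ++p) {
                size_t i = colMajorC ? p : o;
                size_t j = colMajorC ? o : p;
                double *c = &ELEM(C, i, j, incRowC, incColC);
                *c = (beta == 0.0) ? 0.0 : beta * *c;
            }
        }
    }
    if (alpha == 0.0 || m == 0 || n == 0 || k == 0) {
        return;
    }

    /* Step 2: C <- C + alpha*A*B with a loop order matched to C. */
    if (colMajorC) {
        /* Innermost loop runs down a column of C (and a column of A). */
        for (size_t j = 0; j < n; ++j) {
            for (size_t l = 0; l < k; ++l) {
                double b = alpha * ELEM(B, l, j, incRowB, incColB);
                for (size_t i = 0; i < m; ++i) {
                    ELEM(C, i, j, incRowC, incColC) +=
                        ELEM(A, i, l, incRowA, incColA) * b;
                }
            }
        }
    } else {
        /* Innermost loop runs along a row of C (and a row of B). */
        for (size_t i = 0; i < m; ++i) {
            for (size_t l = 0; l < k; ++l) {
                double a = alpha * ELEM(A, i, l, incRowA, incColA);
                for (size_t j = 0; j < n; ++j) {
                    ELEM(C, i, j, incRowC, incColC) +=
                        a * ELEM(B, l, j, incRowB, incColB);
                }
            }
        }
    }
}
```

With this interface, a densely stored m x k matrix A in column-major order has incRowA = 1 and incColA = m, and in row-major order incRowA = k and incColA = 1. In the column-major branch the innermost loop has unit stride in C, and also in A if A is column major, which is one reason the benchmark results can differ between the storage-order combinations listed above.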
-