============================================
GEMV: Computation based on vector operations
============================================

[TOC]

In the last session, you implemented the BLAS operation _gemv_ for a
matrix-vector product of the form

---- LATEX --------------------------
y \leftarrow \beta\, y + \alpha\, A x
-------------------------------------

The performance of the implementation was independent of the storage order
(row-major or col-major). This was achieved through the following algorithm:

---- BOX --------------------------------------------------
= $\mathbf{if}\; \beta \neq 1$
=     $\mathbf{for}\; i=1,\dots,m$
=         $y_i \leftarrow \beta\, y_i$
= $\mathbf{if}\; \text{incRow}_\text{A} > \text{incCol}_\text{A}$
=     $\mathbf{for}\; i=1,\dots,m$
=         $\mathbf{for}\; j=1,\dots,n$
=             $y_i \leftarrow y_i + \alpha\, a_{i,j} x_j$
= $\mathbf{else}$
=     $\mathbf{for}\; j=1,\dots,n$
=         $\mathbf{for}\; i=1,\dots,m$
=             $y_i \leftarrow y_i + \alpha\, a_{i,j} x_j$
-----------------------------------------------------------

This algorithm describes the matrix and vector operations _component-wise_.

Describing algorithms in terms of vector operations
===================================================

We first define some basic vector operations (these are the so-called BLAS
Level 1 operations):

- The _dot_ product: $\alpha \leftarrow x^T y$
- The _scal_ operation: $x \leftarrow \alpha \; x$
- The _axpy_ operation (alpha x plus y): $y \leftarrow y + \alpha \; x$

Using these as building blocks, we can describe the algorithmic variants of
_gemv_ in terms of vector operations. Both variants have in common that the
vector $y$ first gets scaled by $\beta$, i.e.

---- LATEX ---------
y \leftarrow \beta y
--------------------

So what remains to be done is computing

---- LATEX ------------------
y \leftarrow y + \alpha\, A x
-----------------------------

Variant 1
---------

In the first variant we consider the rows of $A$, i.e.
---- LATEX ---------------------------------
A = \begin{pmatrix} \mathbf{a}_1^T \\ \vdots \\ \mathbf{a}_m^T \end{pmatrix}
--------------------------------------------

So $A$ can be viewed as a column vector where each element itself is a row
vector. Hence, the operation $y \leftarrow \beta\, y + \alpha\, A x$ is
equivalent to

---- LATEX ---------------------------------
\begin{pmatrix} y_1 \\ \vdots \\ y_m \end{pmatrix} \leftarrow
\beta \begin{pmatrix} y_1 \\ \vdots \\ y_m \end{pmatrix} +
\alpha \begin{pmatrix} \mathbf{a}_1^T x \\ \vdots \\ \mathbf{a}_m^T x \end{pmatrix}
--------------------------------------------

Using the _dot_ operation, we can compute the matrix-vector product with

---- BOX --------------------------------------------------
= $\mathbf{for}\; i=1,\dots,m$
=     $y_i \leftarrow y_i + \alpha\, \mathbf{a}_i^T x$
-----------------------------------------------------------

Variant 2
---------

We now consider the columns of $A$, i.e.

---- LATEX ---------------------------------
A = \begin{pmatrix} \mathbf{a}_1 & \dots & \mathbf{a}_n \end{pmatrix}
--------------------------------------------

So $A$ can be viewed as a row vector where each element itself is a column
vector. Hence, the operation $y \leftarrow \beta\, y + \alpha\, A x$ is
equivalent to

---- LATEX ---------------------------------
y \leftarrow \beta\, y
  + \alpha\, \mathbf{a}_1 x_1 + \dots + \alpha\, \mathbf{a}_n x_n
--------------------------------------------

So computing $y \leftarrow y + \alpha\, A x$ is equivalent to

---- BOX --------------------------------------------------------
= $\mathbf{for}\; j=1,\dots,n$
=     $y \leftarrow y + \alpha\, \mathbf{a}_{j} x_j$
-----------------------------------------------------------------

Combining both variants
=======================

Variant 1 traverses $A$ row by row and therefore performs better for
row-major matrices; variant 2 traverses $A$ column by column and performs
better for col-major matrices.
Depending on the increments, we can switch to the more favorable variant:

---- BOX --------------------------------------------------------
= $y \leftarrow \beta\, y$
= $\mathbf{if}\; \text{incRow}_\text{A} > \text{incCol}_\text{A}$
=     $\mathbf{for}\; i=1,\dots,m$
=         $y_i \leftarrow y_i + \alpha\, \mathbf{a}_i^T x$
= $\mathbf{else}$
=     $\mathbf{for}\; j=1,\dots,n$
=         $y \leftarrow y + \alpha\, \mathbf{a}_{j} x_j$
-----------------------------------------------------------------

Exercise
========

- Describe algorithms for the BLAS Level 1 operations _scal_, _dot_ and
  _axpy_.
- Implement these algorithms for vectors of type _double_. Use the function
  names dscal_ulm, ddot_ulm and daxpy_ulm.
- Implement the _gemv_ algorithm that combines both variants.

Material
========

In the computer lab E.44 we have installed a free version of the Intel MKL.
We will use the Intel MKL to compare the performance of our _gemv_
implementation.

Solution for session 4 with Intel MKL Benchmark
-----------------------------------------------

:import: session06/page01/gemv.c

Makefile (for the lab in E.44)
------------------------------

The following makefile creates the two executables gemv_rowmajor and
gemv_colmajor:

- With gemv_rowmajor we test the case that $A$ is stored row-major, and
- with gemv_colmajor the case that $A$ is stored col-major.

In both cases the benchmarks compare our performance with the MKL.
:import: session06/page01/Makefile

Running the benchmarks
----------------------

---- SHELL(hostname=heim, path=session06/page01, hide) -------------------------
make clean
--------------------------------------------------------------------------------

---- SHELL(hostname=heim, path=session06/page01) -------------------------------
make
./gemv_colmajor > gemv_colmajor.dat
./gemv_rowmajor > gemv_rowmajor.dat
cat gemv_colmajor.dat
cat gemv_rowmajor.dat
--------------------------------------------------------------------------------

Gnuplot scripts for the benchmarks
----------------------------------

We provide two scripts for plotting the benchmark results:

- For plotting the col-major case you can use:

:import: session06/page01/gemv_colmajor.plot

- For plotting the row-major case you can use:

:import: session06/page01/gemv_rowmajor.plot

As usual, run these scripts through gnuplot:

---- SHELL(hostname=heim, path=session06/page01) -------------------------------
gnuplot gemv_colmajor.plot
gnuplot gemv_rowmajor.plot
--------------------------------------------------------------------------------

Plot of the benchmark results
-----------------------------

- Here is the plot for the col-major case:

---- IMAGE -----------------------------
session06/page01/bench_gemv_colmajor.svg
----------------------------------------

- and here is the plot for the row-major case:

---- IMAGE -----------------------------
session06/page01/bench_gemv_rowmajor.svg
----------------------------------------

:navigate: up -> doc:index next -> doc:session06/page02