Packing blocks of B
Content |
Partitioning of matrix \(B\)
Matrix \(B\) has dimensions \(k \times n\) and will be partitioned with respect to some constants \(K_c\) and \(N_c\). This means that \(B\) is seen as a \(k_b \times n_b\) block-matrix:
\[B = \left(\begin{array}{c|c|c} B_{0,0} & \dots & B_{0,n_b-1} \\ \hline \vdots & & \vdots \\ \hline B_{k_b-1,0} & \dots & B_{k_b-1,n_b-1} \\ \end{array}\right),\quad k_b = \left\lceil \frac{k}{K_c} \right\rceil,\; n_b = \left\lceil \frac{n}{N_c} \right\rceil.\]Each block has maximal dimensions \(K_c \times N_c\). However, blocks at the bottom or on the right border can be smaller. More precisely: For a block
\[B_{i,j} \in \mathbb{M}^{K\times N} \quad (0 \leq i < k_b, \; 0 \leq j < n_b)\]dimensions are given by
\[K = \begin{cases} K_c, & i < k_b-1 \;\; \lor \;\; K_c \,|\, k, \\ k \bmod K_c, & \text{else}, \end{cases}\]and
\[N = \begin{cases} N_c, & j < n_b-1 \;\; \lor \;\;N_c \,|\, n, \\ n \bmod N_c, & \text{else}. \end{cases}\]Packing blocks of \(B\)
Blocks of \(B\) are partitioned into vertical panels with width \(N_r\). Besides the different blocking factor \(N_c\) (and different matrix dimension) packing of \(B\) can be done with the previous packing operation applied to \(B^T\).
Exercise
Extend the code to make it work:
#include <cassert> #include <cstddef> #include <printf.hpp> namespace tools { void initMatrix(std::size_t m, std::size_t n, double *A, std::ptrdiff_t incRowA, std::ptrdiff_t incColA) { for (std::size_t j=0; j<n; ++j) { for (std::size_t i=0; i<m; ++i) { A[i*incRowA+j*incColA] = i*n + j + 1; } } } void printMatrix(std::size_t m, std::size_t n, const double *A, std::ptrdiff_t incRowA, std::ptrdiff_t incColA) { for (std::size_t i=0; i<m; ++i) { for (std::size_t j=0; j<n; ++j) { fmt::printf("%6.2lf ", A[i*incRowA+j*incColA]); } fmt::printf("\n"); } fmt::printf("\n"); } struct DoubleArray { DoubleArray(std::size_t n) : ptr(new double[n]) { assert(ptr); } ~DoubleArray() { delete [] ptr; } operator double *() const { return ptr; } double * const ptr; }; } // namespace tools //------------------------------------------------------------------------------ namespace ulmblas { void pack_A(std::size_t M, std::size_t K, const double *A, std::ptrdiff_t incRowA, std::ptrdiff_t incColA, std::size_t M_R, double *p) { /* TODO: your code */ } void pack_B(std::size_t K, std::size_t N, const double *B, std::ptrdiff_t incRowB, std::ptrdiff_t incColB, std::size_t N_R, double *p) { /* TODO: your code */ } } // namespace ulmblas //------------------------------------------------------------------------------ #ifndef DIM_K #define DIM_K 11 #endif #ifndef DIM_N #define DIM_N 10 #endif #ifndef COLMAJOR_B #define COLMAJOR_B 1 #endif std::size_t K_C = 4; std::size_t N_C = 8; std::size_t N_R = 4; //------------------------------------------------------------------------------ int main() { // For the moment we will not use the matrix class from session 7 // 1) Allocate a k x n matrix B std::size_t k = DIM_K; std::size_t n = DIM_N; tools::DoubleArray B(k*n); std::ptrdiff_t incRowB = (COLMAJOR_B) ? 1 : n; std::ptrdiff_t incColB = (COLMAJOR_B) ? k : 1; // 2) Initialize matrix B tools::initMatrix(k, n, B, incRowB, incColB); // 3) Print dimensions of B and content of B fmt::printf("k = %zu, n = %zu\n", k, n); fmt::printf("B = \n"); tools::printMatrix(k, n, B, incRowB, incColB); fmt::printf("K_C = %zu, N_C = %zu, N_R = %zu\n", K_C, N_C, N_R); // 4) Allocate a buffer p of size K_C * N_C tools::DoubleArray p(K_C*N_C); std::size_t k_b = (k+K_C-1) / K_C; std::size_t n_b = (n+N_C-1) / N_C; fmt::printf("B is partitioned into a %zu x %zu block matrix\n", k_b, n_b); for (std::size_t i=0; i<k_b; ++i) { std::size_t K = /* TODO: your code */ for (std::size_t j=0; j<n_b; ++j) { std::size_t N = /* TODO: your code */ fmt::printf("B_{%zu,%zu} is a %zu x %zu matrix\n", i, j, K, N); // 5) Print the content of the matrix block B_{i,j} fmt::printf("B_{%zu,%zu} = \n", i, j); tools::printMatrix(/* TODO: your code */); // 6) Pack block B_{i,j} in buffer p ulmblas::pack_B(/* TODO: your code */); // 7) Print content of buffer p tools::printMatrix(/* TODO: your code */); } } }