=======================================
Exercise: GEMM for distributed matrices                                   [TOC]
=======================================

Function hpc::mpi::mm is supposed to compute the general matrix-matrix
product for distributed matrices.

Update for class hpc::mpi::Grid
=================================

We extend class hpc::mpi::Grid with row and column communicators that are
created using MPI_Comm_split. Assume we have eight nodes within a
$2 \times 4$ grid:

---- LATEX ---------------------------------------------------------------------
\begin{array}{cccc}
(0,0) & (0,1) & (0,2) & (0,3) \\
(1,0) & (1,1) & (1,2) & (1,3) \\
\end{array}
--------------------------------------------------------------------------------

MPI_Comm_split will create the communicators commRow and commCol. These group
the nodes as follows:

- commRow connects either
   - the nodes where nodeRow is 0, i.e. $(0,0)$, $(0,1)$, $(0,2)$, $(0,3)$,
   - or the nodes where nodeRow is 1, i.e. $(1,0)$, $(1,1)$, $(1,2)$, $(1,3)$.
- commCol connects either
   - the nodes where nodeCol is 0, i.e. $(0,0)$, $(1,0)$,
   - or the nodes where nodeCol is 1, i.e. $(0,1)$, $(1,1)$,
   - or the nodes where nodeCol is 2, i.e. $(0,2)$, $(1,2)$,
   - or the nodes where nodeCol is 3, i.e. $(0,3)$, $(1,3)$.

:import: session27/ex01/grid.hpp

Update for class hpc::mpi::GeMatrix
=====================================

In the previous session, methods like rowOffset or colOffset received a node
rank and then computed the position within the grid.
This was changed:

- Methods that return the row offset or the number of locally stored rows
  now just receive the *node row*, and
- methods that return the col offset or the number of locally stored cols
  now just receive the *node col*.

This simplifies the implementation as follows:

:import: session27/ex01/gematrix.hpp

Update for scatter and gather
=============================

The scatter and gather methods were adapted to the above modifications:

:import: session27/ex01/copy.hpp [fold]

Skeleton for hpc::mpi::mm
===========================

The skeleton already implements the broadcast of the blocks of $A$:

:import: session27/ex01/gemm.hpp

Exercise
========

- Implement the distributed GEMM operation. You just have to add the
  broadcasting of the blocks of $B$.

Test program
============

:import: session27/ex01/test_gemm.cpp

You can compile it on theon as follows:

---- SHELL (path=session27/ex01) -----------------------------------------------
mpic++ -g -std=c++17 -I. -I/home/numerik/pub/hpc/ws19/session27 \
   -o test_gemm test_gemm.cpp
--------------------------------------------------------------------------------

Or on one of the machines in E.44:

---- SHELL (path=session27/ex01,hostname=heim) ---------------------------------
OMPI_CXX=g++-8.3 mpic++ -g -std=c++17 -I. \
   -I/home/numerik/pub/hpc/ws19/session27 -o test_gemm test_gemm.cpp
--------------------------------------------------------------------------------

:navigate: up -> doc:index
           next -> doc:session27/page02