# Distributed Matrices


The class hpc::mpi::GeMatrix is supposed to represent an $$M \times N$$ matrix whose blocks are stored locally on different compute nodes. The compute nodes are organized in a two-dimensional $$m \times n$$ grid, so such a matrix $$B$$ is partitioned as follows:

$$B = \left(\begin{array}{c|c|c|c} B_{0,0} & B_{0,1} & \dots & B_{0,n-1} \\ \hline \vdots & & & \vdots \\ \hline B_{m-1,0} & B_{m-1,1} & \dots & B_{m-1,n-1} \end{array}\right)$$

The partitioning is quasi-equidistant, such that all inner blocks share the same dimension $$\left\lceil \frac{M}{m} \right\rceil \times \left\lceil \frac{N}{n} \right\rceil$$. This is also the maximal dimension required for storing a local matrix block. For example, distributing $$M = 10$$ rows among $$m = 3$$ node rows requires at most $$\lceil 10/3 \rceil = 4$$ rows per block.

Using this abstraction allows us to hide some technical details related to MPI. For example:

• Assume A is of type hpc::matvec::GeMatrix and stores the complete $$M \times N$$ matrix on a single local (root) node.

• Let B be of type hpc::mpi::GeMatrix and represent the corresponding distributed matrix.

Then the scatter and gather operations can be carried out as follows:

```cpp
// Setup A locally on a single node

copy(A, B);  // Scatter/distribute the matrix from the root node to the grid

// Perform some parallel operations on the grid

copy(B, A);  // Gather/collect the blocks into the local matrix on the root node
```


## Class hpc::mpi::Grid

Information about the MPI grid and the MPI communicator is encapsulated in an object of type hpc::mpi::Grid, which is stored within the hpc::mpi::GeMatrix class.

Your first exercise will be the implementation of class hpc::mpi::Grid:

```cpp
#ifndef HPC_MPI_GRID_HPP
#define HPC_MPI_GRID_HPP 1

#include <mpi.h>

namespace hpc { namespace mpi {

struct Grid
{
    Grid()
    {
        /* TODO: Your code */
    }

    int         numNodeRows, numNodeCols;
    int         nodeRow, nodeCol;
    int         nof_processes, rank;

    MPI_Comm    comm;
};

} } // namespaces mpi, hpc

#endif // HPC_MPI_GRID_HPP
```


### Exercise

Implement class hpc::mpi::Grid. In the constructor, set up the two-dimensional grid with MPI_Cart_create from MPI_COMM_WORLD. Also set up the following attributes (a possible sketch follows after this list):

• numNodeRows, numNodeCols are supposed to be the number of grid rows and columns, respectively.

• nodeRow, nodeCol are supposed to contain the row and column position of the process within the grid, respectively.
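A minimal sketch of such a constructor is shown below. It assumes a non-periodic grid whose dimensions are chosen by MPI_Dims_create; the choice of periodicity and of the reorder flag is up to your implementation:

```cpp
Grid()
{
    MPI_Comm_size(MPI_COMM_WORLD, &nof_processes);

    // let MPI factorize the number of processes into a balanced
    // two-dimensional grid (an entry of 0 means "choose freely")
    int dims[2]    = {0, 0};
    int periods[2] = {0, 0};          // non-periodic in both dimensions
    MPI_Dims_create(nof_processes, 2, dims);

    // create the Cartesian communicator; ranks may be reordered
    MPI_Cart_create(MPI_COMM_WORLD, 2, dims, periods,
                    /* reorder = */ 1, &comm);

    numNodeRows = dims[0];
    numNodeCols = dims[1];

    // determine our rank and our position within the grid
    MPI_Comm_rank(comm, &rank);
    int coords[2];
    MPI_Cart_coords(comm, rank, 2, coords);
    nodeRow = coords[0];
    nodeCol = coords[1];
}
```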

## Class hpc::mpi::GeMatrix

Each matrix of type hpc::mpi::GeMatrix stores its local matrix block in a matrix of type hpc::matvec::GeMatrix. Information regarding the partitioning can be accessed through the following methods:

• rowOffset(p) internally maps the process id $$p$$ to its position $$(r,c)$$ in the node grid, where $$r$$ is the node row and $$c$$ the node column. It then returns the global row index of the top-left element of block $$A_{r,c}$$.

• colOffset(p) performs the same mapping and returns the global column index of the top-left element of $$A_{r,c}$$.

Analogously,

• numLocalRows(p) returns the number of rows of block $$A_{r,c}$$,

• numLocalCols(p) returns the number of columns of block $$A_{r,c}$$.

The number of rows and columns of the complete matrix can be accessed through the attributes numRows and numCols.

### Exercise

Use the following skeleton for the implementation:

```cpp
#ifndef HPC_MPI_GEMATRIX_HPP
#define HPC_MPI_GEMATRIX_HPP 1

#include <cassert>
#include <memory>
#include <type_traits>
#include <mpi.h>
#include <hpc/aux/slices.hpp>
#include <hpc/matvec/gematrix.hpp>
#include <hpc/mpi/fundamental.hpp>
#include "grid.hpp"

namespace hpc { namespace mpi {

template <typename T>
struct GeMatrix
{
    GeMatrix(std::size_t numRows, std::size_t numCols, Grid grid)
        : numRows(numRows), numCols(numCols),
          rows(grid.numNodeRows, numRows),
          cols(grid.numNodeCols, numCols),
          grid(grid),
          buffer(rows.size(grid.nodeRow), cols.size(grid.nodeCol))
    {
    }

    std::size_t
    rowOffset(int rank) const
    {
        // TODO: Your code
    }

    std::size_t
    colOffset(int rank) const
    {
        // TODO: Your code
    }

    std::size_t
    numLocalRows(int rank) const
    {
        // TODO: Your code
    }

    std::size_t
    numLocalCols(int rank) const
    {
        // TODO: Your code
    }

    std::size_t                  numRows, numCols;
    hpc::aux::UniformSlices<int> rows, cols;
    Grid                         grid;
    hpc::matvec::GeMatrix<T>     buffer;
};

} } // namespaces mpi, hpc

#endif // HPC_MPI_GEMATRIX_HPP
```
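One possible approach for these methods, shown here as a sketch: translate the rank into grid coordinates with MPI_Cart_coords and delegate the index computation to the slices. Note that this assumes hpc::aux::UniformSlices provides an offset() method alongside the size() method already used in the constructor:

```cpp
std::size_t
rowOffset(int rank) const
{
    // map the process id to its (row, col) position in the node grid ...
    int coords[2];
    MPI_Cart_coords(grid.comm, rank, 2, coords);
    // ... and let the row slices compute the first global row index
    return rows.offset(coords[0]);   // offset() is assumed here
}

std::size_t
numLocalRows(int rank) const
{
    int coords[2];
    MPI_Cart_coords(grid.comm, rank, 2, coords);
    return rows.size(coords[0]);
}

// colOffset and numLocalCols are implemented analogously,
// using cols and coords[1] instead of rows and coords[0].
```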


## Exercise: Scatter/Gather

Implement the scatter/gather operations:

• For the scatter operation, the node with rank zero sends the blocks to the other processes.

• For the gather operation, the node with rank zero receives the blocks from the other processes.

Use the following skeleton; a sketch of a possible scatter implementation follows after it.

```cpp
#ifndef HPC_MPI_COPY_HPP
#define HPC_MPI_COPY_HPP 1

#include <cassert>
#include <mpi.h>
#include <hpc/matvec/gematrix.hpp>
#include <hpc/matvec/copy.hpp>
#include "gematrix.hpp"
#include <hpc/mpi/matrix.hpp>

namespace hpc { namespace mpi {

// scatter
template <typename T>
void
copy(const hpc::matvec::GeMatrix<T> &A, GeMatrix<T> &B)
{
    // TODO: Your code
}

// gather
template <typename T>
void
copy(const GeMatrix<T> &A, hpc::matvec::GeMatrix<T> &B)
{
    // TODO: Your code
}

} } // namespaces mpi, hpc

#endif // HPC_MPI_COPY_HPP
```
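Below is a minimal, unoptimized sketch of the scatter operation. To stay independent of any derived-datatype helpers, it fixes the element type to double and packs each block into a contiguous temporary buffer (this requires &lt;vector&gt;); an efficient implementation would instead send the blocks in place using a suitable MPI datatype, e.g. one constructed with MPI_Type_vector. It assumes hpc::matvec::GeMatrix exposes numRows, numCols, and element access via operator(), as used in the test program below. The gather operation is symmetric, with the directions of the sends and receives reversed:

```cpp
// scatter sketch: rank 0 distributes the blocks of A to all processes
void
copy(const hpc::matvec::GeMatrix<double> &A, GeMatrix<double> &B)
{
    assert(A.numRows == B.numRows && A.numCols == B.numCols);

    int root = 0;
    if (B.grid.rank == root) {
        for (int p = 0; p < B.grid.nof_processes; ++p) {
            std::size_t m  = B.numLocalRows(p);
            std::size_t n  = B.numLocalCols(p);
            std::size_t i0 = B.rowOffset(p);
            std::size_t j0 = B.colOffset(p);

            if (p == root) {
                // the root keeps its own block
                for (std::size_t i = 0; i < m; ++i) {
                    for (std::size_t j = 0; j < n; ++j) {
                        B.buffer(i, j) = A(i0 + i, j0 + j);
                    }
                }
            } else {
                // pack the block contiguously and send it
                std::vector<double> buf(m * n);
                for (std::size_t i = 0; i < m; ++i) {
                    for (std::size_t j = 0; j < n; ++j) {
                        buf[i * n + j] = A(i0 + i, j0 + j);
                    }
                }
                MPI_Send(buf.data(), static_cast<int>(m * n), MPI_DOUBLE,
                         p, 0, B.grid.comm);
            }
        }
    } else {
        // receive our block and unpack it into the local buffer
        std::size_t m = B.buffer.numRows;
        std::size_t n = B.buffer.numCols;
        std::vector<double> buf(m * n);
        MPI_Recv(buf.data(), static_cast<int>(m * n), MPI_DOUBLE,
                 root, 0, B.grid.comm, MPI_STATUS_IGNORE);
        for (std::size_t i = 0; i < m; ++i) {
            for (std::size_t j = 0; j < n; ++j) {
                B.buffer(i, j) = buf[i * n + j];
            }
        }
    }
}
```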


### Program for testing

```cpp
#include <hpc/matvec/print.hpp>
#include "grid.hpp"
#include "gematrix.hpp"
#include "copy.hpp"
#include <mpi.h>

int
main(int argc, char** argv)
{
    MPI_Init(&argc, &argv);

    hpc::mpi::Grid grid;

    std::size_t numRows = 10;
    std::size_t numCols = 10;

    hpc::matvec::GeMatrix<double>    A(numRows, numCols);
    hpc::matvec::GeMatrix<double>    B(numRows, numCols);
    hpc::mpi::GeMatrix<double>       dA(numRows, numCols, grid);

    if (grid.rank == 0) {
        for (std::size_t i = 0; i < numRows; ++i) {
            for (std::size_t j = 0; j < numCols; ++j) {
                A(i, j) = i + j*numRows;
            }
        }
        hpc::matvec::print(A);
    }

    // scatter
    hpc::mpi::copy(A, dA);

    // gather
    hpc::mpi::copy(dA, B);

    if (grid.rank == 0) {
        hpc::matvec::print(B);
    }

    MPI_Finalize();
}
```


You can compile the test program on theon with:

```sh
theon$ mpic++ -g -std=c++17 -I. -I/home/numerik/pub/hpc/ws18/session25 -o test_gematrix test_gematrix.cpp
theon$
```
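To run it with, for example, four processes (the exact launcher name may vary between MPI installations):

```sh
theon$ mpirun -np 4 ./test_gematrix
```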