Transfer of user-defined data types using MPI

Let us return to the interfaces of MPI_Send and MPI_Recv:

int MPI_Send(void* buf, int count, MPI_Datatype datatype,
             int dest, int tag, MPI_Comm comm);
int MPI_Recv(void* buf, int count, MPI_Datatype datatype,
             int source, int tag, MPI_Comm comm,
             MPI_Status* status);

The point-to-point communication operations MPI_Send and MPI_Recv support the transfer of arrays but require the array elements to be stored consecutively in memory. Hence, these operations do not allow us to transfer arbitrary objects of type DenseVectorView or GeMatrixView, as the strides between consecutive elements (which can be configured through inc or through incRow and incCol, respectively) can be arbitrary.

MPI allows us to define our own objects of type MPI_Datatype, which makes it possible to describe vector and matrix layouts that match those of our classes. However, we should first learn a little more about data types in MPI.

MPI comes with a set of elementary types \(ET\) which includes, for example, MPI_DOUBLE and MPI_INT. Within the MPI library, a data type \(T\) with cardinality \(n\) is a sequence of tuples \(\left\{ (et_1, o_1), (et_2, o_2), \dots, (et_n, o_n) \right\}\) where \(et_i \in ET\) and the associated offsets \(o_i \in \mathbb{Z}\) for \(i = 1, \dots, n\). Each offset specifies the displacement in bytes relative to the start address of the object.

Two types \(T_1\) and \(T_2\) are considered compatible with each other for MPI_Send and MPI_Recv if and only if the cardinalities \(n_1\) and \(n_2\) are equal and \(et_{1_i} = et_{2_i}\) for all \(i = 1, \dots, n_1\). Overlapping offsets are permitted for MPI_Send; in the case of MPI_Recv their effect is undefined.

Some examples:
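
Consider, for instance, a vector of three doubles whose elements are stored two element positions apart, i.e. only every second double belongs to the vector (assuming for concreteness that a double occupies 8 bytes). Its type map is {(MPI_DOUBLE, 0), (MPI_DOUBLE, 16), (MPI_DOUBLE, 32)}. A contiguous vector of three doubles has the type map {(MPI_DOUBLE, 0), (MPI_DOUBLE, 8), (MPI_DOUBLE, 16)}. Both types consist of the same sequence of elementary types and are therefore compatible: the strided vector can be sent and received into the contiguous one, and vice versa.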

As the compatibility of two types \(T_1\) and \(T_2\) is independent of the offsets on the sender and receiver sides, a transfer can be used to convert row vectors to column vectors or to transpose a matrix.

In MPI, multiple functions support the construction of data types. All of them construct a sequence of tuples as described above. Let us start with the type construction function MPI_Type_vector:

int MPI_Type_vector(int count, int blocklength, int stride,
                    MPI_Datatype oldtype, MPI_Datatype* newtype);

The element type is always given as oldtype. The parameter blocklength specifies how many elements of this type lie consecutively in memory; together they are considered a block. The distance between the beginnings of two consecutive blocks is given by stride -- this value is implicitly multiplied by the extent of the element type. In total, the data type covers count blocks.
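
As a sketch of how these parameters relate to a strided vector, the following snippet describes a vector of n doubles whose consecutive elements are inc element positions apart (n and inc are just illustrative variables, not part of any library): each block consists of a single element and consecutive blocks start inc element extents apart.

MPI_Datatype vectype;
MPI_Type_vector(n,           /* count: number of blocks (= elements) */
                1,           /* blocklength: one element per block */
                inc,         /* stride between block starts, in elements */
                MPI_DOUBLE,  /* element type */
                &vectype);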

Whenever a data type has been constructed using functions like MPI_Type_vector, it must be committed using MPI_Type_commit before it can be used for actual transfers; committing converts it internally into the flat sequence of tuples described above.

int MPI_Type_commit(MPI_Datatype* datatype);
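
Continuing the sketch from above, committing the constructed type (and releasing it once it is no longer needed) could look as follows:

MPI_Type_commit(&vectype);
/* ... use vectype in MPI_Send / MPI_Recv operations ... */
MPI_Type_free(&vectype);   /* release the type when it is no longer needed */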

Exercise

The following source code is already prepared for a small test that transfers vectors. Construct the required MPI data types such that each vector can be transferred with a single MPI_Send and a single MPI_Recv operation.

It is recommended to write a small function that constructs a matching MPI_Datatype object for a vector. Such a template function could look as follows:

template<typename Vector>
typename std::enable_if<hpc::matvec::IsDenseVector<Vector>::value,
   MPI_Datatype>::type
get_type(const Vector& vector) {
   using ElementType = typename Vector::ElementType;
   MPI_Datatype datatype;
   /* ... */
   return datatype;
}

If you are looking for a general approach to map elementary C++ types to the corresponding elementary MPI data type objects, you are free to use #include <hpc/mpi/fundamental.h>, where you will find the template function get_type which supports all elementary types of MPI. If you have, for example, double x; then hpc::mpi::get_type(x) returns MPI_DOUBLE.
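
One possible shape of get_type for vectors is sketched below; it combines MPI_Type_vector with hpc::mpi::get_type for the element type and assumes the same headers as above plus <hpc/mpi/fundamental.h>. Note that the accessors length() and inc() are merely assumed names for this sketch -- use whatever the actual vector class provides for the number of elements and the increment between consecutive elements.

template<typename Vector>
typename std::enable_if<hpc::matvec::IsDenseVector<Vector>::value,
   MPI_Datatype>::type
get_type(const Vector& vector) {
   using ElementType = typename Vector::ElementType;
   ElementType dummy{};
   MPI_Datatype datatype;
   /* one block per element; consecutive elements are vector.inc()
      element extents apart (length() and inc() are assumed accessors) */
   MPI_Type_vector(int(vector.length()), 1, int(vector.inc()),
      hpc::mpi::get_type(dummy), &datatype);
   MPI_Type_commit(&datatype);
   return datatype;
}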

The library for this session can be found at /home/numerik/pub/hpc/session22 on our hosts.

Source code

#include <cassert>
#include <cstdlib>
#include <mpi.h>
#include <printf.hpp>
#include <hpc/matvec/apply.h>
#include <hpc/matvec/densevector.h>
#include <hpc/matvec/gematrix.h>

int main(int argc, char** argv) {
   MPI_Init(&argc, &argv);

   int nof_processes; MPI_Comm_size(MPI_COMM_WORLD, &nof_processes);
   int rank; MPI_Comm_rank(MPI_COMM_WORLD, &rank);
   assert(nof_processes == 2);

   using namespace hpc::matvec;
   using namespace hpc::mpi;
   using namespace hpc::aux;

   std::size_t nof_rows = 3;
   std::size_t nof_cols = 7;

   if (rank == 0) {
      GeMatrix<double> A(nof_rows, nof_cols, StorageOrder::RowMajor);
      using Index = GeMatrix<double>::Index;
      apply(A, [](double& val, Index i, Index j) -> void {
	 val = i * 100 + j;
      });
      auto row = A.row(2);
      auto col = A.col(0);

      MPI_Send(/* row to rank 1 */);
      MPI_Send(/* col to rank 1 */);

      /* receive it back for verification */
      DenseVector<double> vec1(nof_cols), vec2(nof_rows);
      MPI_Status status;
      MPI_Recv(/* vec1 from rank 1 */);
      MPI_Recv(/* vec2 from rank 1 */);

      /* verify it */
      apply(vec1, [=](double& val, Index i) {
	 if (val != row(i)) {
	    fmt::printf("verification failed for row(%d): %lg vs %lg\n",
	       (int)i, val, row(i));
	 }
      });
      apply(vec2, [=](double& val, Index i) {
	 if (val != col(i)) {
	    fmt::printf("verification failed for col(%d): %lg vs %lg\n",
	       (int)i, val, col(i));
	 }
      });
   } else {
      DenseVector<double> vec1(nof_cols), vec2(nof_rows);
      MPI_Status status;
      MPI_Recv(/* vec1 from rank 0 */);
      MPI_Recv(/* vec2 from rank 0 */);

      /* send it back for verification */
      MPI_Send(/* vec1 to rank 0 */);
      MPI_Send(/* vec2 to rank 0 */);
   }
   MPI_Finalize();
}