=================================
Introduction to the MPI interface [TOC]
=================================

MPI (Message Passing Interface) is an open standard for a library
interface for parallel programs. In contrast to threads and OpenMP,
MPI does not rely on communication over shared memory. Instead, each
thread of execution runs in a process of its own with its own virtual
memory. This makes it possible to run MPI applications on a cluster
(like Pacioli) or an arbitrary network of computers.

In 1994, the first release of the standard (1.0) was published;
version 2.0 followed in 1997. The current standard 3.1 was published
in June 2015. The latter version is fully supported by our
installation at Theon; our Debian hosts in our pool support MPI 2.1.
The standards are public at `http://www.mpi-forum.org/`.

The standard includes language-specific interfaces for Fortran and C.
We use the C interface for our C++ programs.

Multiple open source implementations are available:

* OpenMPI: `http://www.open-mpi.org/`
  (on Theon and on our Debian hosts in E.44)
* MPICH: `http://www.mpich.org/`
* MVAPICH: `http://mvapich.cse.ohio-state.edu/` (supports Infiniband)

We use OpenMPI in our lecture and our sessions. For compiling and
linking, the commands `mpiCC` or `mpic++` have to be used instead of
`g++`. MPI programs are not run directly but with the help of the
`mpirun` tool that configures the execution environment. `mpirun`
needs to know how many processes are to be started on which hosts.
This is specified on the command line. The total number of processes
is thereby fixed by `mpirun` and cannot be changed while the MPI
application runs.

Previously, in our thread-based solutions and in OpenMP, our
application started with a single thread and parallelization was
always explicit. In MPI, all processes are started right from the
beginning by `mpirun` and execute (by default) the same program. As in
OpenMP, every process can learn the total number of processes and its
own process number (named _rank_ in MPI).

The process with rank 0 has a special role. It should be the only
process that considers command line arguments and accesses standard
input/output. In case of a master/worker pattern, the process with
rank 0 naturally assumes the role of the master.

The following small example demonstrates how the workers send their
ranks using `MPI_Send` to the master while the master receives these
messages using `MPI_Recv` and prints them to standard output.

:import:session05/mpi-test.cpp

Remember that all _n_ processes run this program concurrently.

`MPI_Init` must be invoked before any other MPI function is used. This
function connects the process to the execution environment. At the
end, the function `MPI_Finalize` is to be invoked; it severs the link
of the process to its networked execution environment.

The function `MPI_Comm_rank` returns the own process number (rank) and
`MPI_Comm_size` delivers the total number of processes. In both cases,
the first parameter refers to the communication domain. This is always
`MPI_COMM_WORLD` in our trivial examples, which refers to all
processes. We will see later how new communication domains can be
created.

The most important operations for communication are `MPI_Send` and
`MPI_Recv`. Both have a similar sequence of parameters:

* `void* buf`: a pointer to an array whose values are to be
  transferred.
* `int count`: the size of the array (if there is just a single value,
  1 is to be given).
* `MPI_Datatype datatype`: the data type of an element of the array.
  For elementary data types, predefined constants can be used, e.g.
  `MPI_INT` for `int`.
* `int dest` (for `MPI_Send`) and `int source` (for `MPI_Recv`) name
  the recipient or the sender of a message, respectively.
* `int tag`: messages can be tagged; `MPI_Recv` accepts only messages
  whose tag matches the given value (unless the wildcard `MPI_ANY_TAG`
  is used).
* `MPI_Comm comm`: the communication domain that is to be used for
  this transfer; this is always `MPI_COMM_WORLD` in trivial cases.
* `MPI_Status* status`: an extra parameter for `MPI_Recv`. It allows
  you to determine how many elements of the array have actually been
  received; this can be fewer than `count`, even 0.
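The following sketch is not part of the imported example; it merely
illustrates how these parameters play together (the buffer size, the
tag value, and the transmitted values are chosen arbitrarily for this
illustration). Rank 1 sends a small array of `double` values to
rank 0, and rank 0 inspects the `MPI_Status` object with
`MPI_Get_count` to learn how many elements actually arrived. It needs
at least two processes.

---- CODE (type=cpp) ----------------------------------------------------------
#include <cstdio>
#include <mpi.h>

int main(int argc, char** argv) {
   MPI_Init(&argc, &argv);
   int rank; MPI_Comm_rank(MPI_COMM_WORLD, &rank);

   const int maxvalues = 8;   /* capacity of the receive buffer (arbitrary) */
   const int tag = 42;        /* arbitrary tag for this illustration */
   if (rank == 0) {
      double buf[maxvalues];
      MPI_Status status;
      /* accept up to maxvalues doubles from rank 1 with the given tag */
      MPI_Recv(buf, maxvalues, MPI_DOUBLE, /* source = */ 1, tag,
         MPI_COMM_WORLD, &status);
      /* the status tells us how many elements actually arrived */
      int count; MPI_Get_count(&status, MPI_DOUBLE, &count);
      std::printf("received %d values from rank %d\n",
         count, status.MPI_SOURCE);
   } else if (rank == 1) {
      double values[] = {1.5, 2.5, 3.5};
      /* send 3 doubles to rank 0; the count may be smaller than the
         capacity of the receiver's buffer */
      MPI_Send(values, 3, MPI_DOUBLE, /* dest = */ 0, tag, MPI_COMM_WORLD);
   }
   MPI_Finalize();
}
-------------------------------------------------------------------------------

Started with `mpirun -np 2` (or more), rank 0 reports three received
values from rank 1; any further processes take no part in the
exchange.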
Compilation and execution:

---- SHELL (path=session05,hostname=theon) ------------------------------------
mpiCC -g -std=c++17 -o mpi-test mpi-test.cpp
mpirun -np 4 mpi-test
-------------------------------------------------------------------------------

---- SHELL (path=session05,hostname=heim) -------------------------------------
mpiCC -g -std=c++11 -o mpi-test mpi-test.cpp -Wno-literal-suffix
mpirun -np 4 mpi-test
-------------------------------------------------------------------------------

---- SHELL (path=session05,hostname=heim) -------------------------------------
OMPI_CXX=g++-8.3 mpiCC -g -std=c++17 -o mpi-test mpi-test.cpp -Wno-literal-suffix
mpirun -np 4 mpi-test
-------------------------------------------------------------------------------

You will get a harmless warning on our Debian machines in E.44 unless
you suppress it using the option `-Wno-literal-suffix`, as the OpenMPI
headers are somewhat outdated for C++11. The _OMPI_CXX_ environment
variable allows you to switch to another C++ compiler.

Exercise
========

Extend the trivial example such that a numerical integral is computed
for the function _f_ on the interval $[a, b]$ by using $n$
subintervals:

---- LATEX --------------------------------------------------------------------
S(f,a,b,n) ~=~ \frac{h}{3} \left( \frac{1}{2} f(x_0) + \sum_{k=1}^{n-1} f(x_k) + 2 \sum_{k=1}^{n} f\left(\frac{x_{k-1} + x_k}{2}\right) + \frac{1}{2} f(x_n) \right)
-------------------------------------------------------------------------------

with $h ~=~ \frac{b-a}{n}$ and $x_k ~=~ a + k \cdot h$. If you use the
function $\frac{4}{1 + x^2}$ on the interval $[0, 1]$, then you can
compare the result against $\pi$.

All processes (including the one with rank 0) shall compute the
numerical integral on their respective subinterval (which in turn is
to be divided further). Afterwards, all processes with a positive rank
shall send their results to the master process with rank 0, which sums
up the individual results and prints the total to standard output.
`MPI_DOUBLE` is to be used as the MPI data type for `double`.

:navigate: up -> doc:index
           next -> doc:session05/page02