[library Boost.MPI
    [authors [Gregor, Douglas], [Troyer, Matthias] ]
    [copyright 2005 2006 2007 Douglas Gregor, Matthias Troyer, Trustees of Indiana University]
    [purpose
        A generic, user-friendly interface to MPI, the Message
        Passing Interface.
    ]
    [id mpi]
    [dirname mpi]
    [license
        Distributed under the Boost Software License, Version 1.0.
        (See accompanying file LICENSE_1_0.txt or copy at
        <ulink url="http://www.boost.org/LICENSE_1_0.txt">
            http://www.boost.org/LICENSE_1_0.txt
        </ulink>)
    ]
]

[/ Links ]
[def _MPI_ [@http://www-unix.mcs.anl.gov/mpi/ MPI]]
[def _MPI_implementations_
    [@http://www-unix.mcs.anl.gov/mpi/implementations.html
    MPI implementations]]
[def _Serialization_ [@boost:/libs/serialization/doc
    Boost.Serialization]]
[def _BoostPython_ [@http://www.boost.org/libs/python/doc
    Boost.Python]]
[def _Python_ [@http://www.python.org Python]]
[def _MPICH_ [@http://www-unix.mcs.anl.gov/mpi/mpich/ MPICH2]]
[def _OpenMPI_ [@http://www.open-mpi.org OpenMPI]]
[def _IntelMPI_ [@https://software.intel.com/en-us/intel-mpi-library Intel MPI]]
[def _accumulate_ [@http://www.sgi.com/tech/stl/accumulate.html
    `accumulate`]]

[/ QuickBook Document version 1.0 ]

[section:intro Introduction]

Boost.MPI is a library for message passing in high-performance
parallel applications. A Boost.MPI program is one or more processes
that can communicate either via sending and receiving individual
messages (point-to-point communication) or by coordinating as a group
(collective communication). Unlike communication in threaded
environments or using a shared-memory library, Boost.MPI processes can
be spread across many different machines, possibly with different
operating systems and underlying architectures.

Boost.MPI is not a completely new parallel programming
library. Rather, it is a C++-friendly interface to the standard
Message Passing Interface (_MPI_), the most popular library interface
for high-performance, distributed computing. MPI defines
a library interface, available from C, Fortran, and C++, for which
there are many _MPI_implementations_. Although there exist C++
bindings for MPI, they offer little functionality over the C
bindings. The Boost.MPI library provides an alternative C++ interface
to MPI that better supports modern C++ development styles, including
complete support for user-defined data types and C++ Standard Library
types, arbitrary function objects for collective algorithms, and the
use of modern C++ library techniques to maintain maximal
efficiency.

At present, Boost.MPI supports the majority of functionality in MPI
1.1. The thin abstractions in Boost.MPI allow one to easily combine it
with calls to the underlying C MPI library, as the sketch following
this feature list illustrates. Boost.MPI currently supports:

* Communicators: Boost.MPI supports the creation,
  destruction, cloning, and splitting of MPI communicators, along with
  manipulation of process groups.
* Point-to-point communication: Boost.MPI supports
  point-to-point communication of primitive and user-defined data
  types with send and receive operations, with blocking and
  non-blocking interfaces.
* Collective communication: Boost.MPI supports collective
  operations such as [funcref boost::mpi::reduce `reduce`]
  and [funcref boost::mpi::gather `gather`] with both
  built-in and user-defined data types and function objects.
* MPI Datatypes: Boost.MPI can build MPI data types for
  user-defined types using the _Serialization_ library.
* Separating structure from content: Boost.MPI can transfer the shape
  (or "skeleton") of complex data structures (lists, maps,
  etc.) and then separately transfer their content. This facility
  optimizes for cases where the data within a large, static data
  structure needs to be transmitted many times.

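As an illustration of how thin these abstractions are, the following
minimal sketch (ours, not one of the library's shipped examples) mixes
Boost.MPI with a plain C MPI call; it relies on the documented implicit
conversion from [classref boost::mpi::communicator `communicator`] to
`MPI_Comm`:

    #include <boost/mpi.hpp>
    namespace mpi = boost::mpi;

    int main()
    {
      mpi::environment env;
      mpi::communicator world;   // wraps MPI_COMM_WORLD

      MPI_Comm raw = world;      // implicit conversion to the C handle
      MPI_Barrier(raw);          // plain C MPI call on the same communicator

      return 0;
    }
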
Boost.MPI can be accessed either through its native C++ bindings, or
through its alternative, [link mpi.python Python interface].

[endsect]

[section:getting_started Getting started]

Getting started with Boost.MPI requires a working MPI implementation,
a recent version of Boost, and some configuration information.

[section:mpi_impl MPI Implementation]
To get started with Boost.MPI, you will first need a working
MPI implementation. There are many conforming _MPI_implementations_
available. Boost.MPI should work with any of the
implementations, although it has only been tested extensively with:

* [@http://www.open-mpi.org Open MPI]
* [@http://www-unix.mcs.anl.gov/mpi/mpich/ MPICH2]
* [@https://software.intel.com/en-us/intel-mpi-library Intel MPI]

You can test your implementation using the following simple program,
which passes a message from one processor to another. Each processor
prints a message to standard output.

    #include <mpi.h>
    #include <iostream>

    int main(int argc, char* argv[])
    {
      MPI_Init(&argc, &argv);

      int rank;
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      if (rank == 0) {
        int value = 17;
        int result = MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        if (result == MPI_SUCCESS)
          std::cout << "Rank 0 OK!" << std::endl;
      } else if (rank == 1) {
        int value;
        int result = MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                              MPI_STATUS_IGNORE);
        if (result == MPI_SUCCESS && value == 17)
          std::cout << "Rank 1 OK!" << std::endl;
      }
      MPI_Finalize();
      return 0;
    }

You should compile and run this program on two processors. To do this,
consult the documentation for your MPI implementation. With _OpenMPI_, for
instance, you compile with the `mpiCC` or `mpic++` wrapper compiler and
run your program via `mpirun`; with the older LAM/MPI implementation you
must additionally boot and halt the LAM daemon. For instance, if
your program is called `mpi-test.cpp`, use the following commands (the
`lamboot` and `lamhalt` steps apply only to LAM/MPI):

[pre
mpiCC -o mpi-test mpi-test.cpp
lamboot
mpirun -np 2 ./mpi-test
lamhalt
]

When you run this program, you will see both `Rank 0 OK!` and `Rank 1
OK!` printed to the screen. However, they may be printed in any order
and may even overlap each other. The following output is perfectly
legitimate for this MPI program:

[pre
Rank Rank 1 OK!
0 OK!
]

If your output looks something like the above, your MPI implementation
appears to be working with a C++ compiler and we're ready to move on.
[endsect]

[section:config Configure and Build]

[section:bjam Build Environment]

Like the rest of Boost, Boost.MPI uses version 2 of the
[@http://www.boost.org/doc/html/bbv2.html Boost.Build] system for
configuring and building the library binary.

Please refer to the general Boost installation instructions for
[@http://www.boost.org/doc/libs/release/more/getting_started/unix-variants.html#prepare-to-use-a-boost-library-binary Unix variants]
(including Unix, Linux and MacOS) or
[@http://www.boost.org/doc/libs/1_58_0/more/getting_started/windows.html#prepare-to-use-a-boost-library-binary Windows].
The simplified build instructions should apply on most platforms with a few specific modifications described below.
[endsect]

[section:bootstraping Bootstrap]

As described in the Boost installation instructions, go to the root of your Boost source distribution
and run the `bootstrap` script (`./bootstrap.sh` for Unix variants or `bootstrap.bat` for Windows).
That will generate a `project-config.jam` file in the root directory.
Use your favourite text editor and add the following line:

    using mpi ;

Alternatively, you can explicitly provide the list of Boost libraries you want to build.
Please refer to the `--help` option of the `bootstrap` script.
[endsect]

[section:mpi_setup Setting up your MPI Implementation]

First, you need to scan the =include/boost/mpi/config.hpp= file and check whether some
settings need to be modified for your MPI implementation or preferences.

In particular, note the [macroref BOOST_MPI_HOMOGENEOUS] macro, which you will need to comment out
if you plan to run on a heterogeneous set of machines. See the [link mpi.homogeneous_machines optimization] notes below.

Most MPI implementations require specific compilation and link options.
In order to hide these options from the user, most MPI implementations provide
wrappers which silently pass those options to the compiler.

Depending on your MPI implementation, some work might be needed to tell Boost which
specific MPI options to use. This is done through the `using mpi ;` directive in the `project-config.jam` file.

The general form is the following (do not forget to leave spaces around *:* and before *;*):

[pre
using mpi
   : \[<MPI compiler wrapper>\]
   : \[<compilation and link options>\]
   : \[<mpi runner>\] ;
]

* [* If you're lucky]

If you use _MPICH_, _OpenMPI_ or one of their derivatives, configuration can be
almost automatic. In fact, if your `mpicxx` command is in your path, you just need to use:

[pre
using mpi ;
]

The directive will find the wrapper and deduce the options to use.

* [*If your wrapper is not in your path]

...or if it does not have a usual wrapper name, you will need to tell the build system where to find it:

[pre
using mpi : /opt/mpi/bullxmpi/1.2.8.3/bin/mpicc ;
]

* [*If your wrapper is really eccentric]

...or does not exist at all (it happens), you need to
provide the compilation and build options to the build environment using `jam` directives.
For example, the following could be used for a specific Intel MPI implementation:

[pre
using mpi : mpiicc :
      <library-path>/softs/intel/impi/5.0.1.035/intel64/lib
      <library-path>/softs/intel/impi/5.0.1.035/intel64/lib/release_mt
      <include>/softs/intel/impi/5.0.1.035/intel64/include
      <find-shared-library>mpifort
      <find-shared-library>mpi_mt
      <find-shared-library>mpigi
      <find-shared-library>dl
      <find-shared-library>rt ;
]

To do that, you need to guess the libraries and include directories associated with your environment.
You can refer to your specific MPI environment's documentation.
Most of the time, though, your wrapper has an option that provides that information; it usually starts with `--show`:

[pre
$ mpiicc -show
icc -I/softs/intel//impi/5.0.3.048/intel64/include -L/softs/intel//impi/5.0.3.048/intel64/lib/release_mt -L/softs/intel//impi/5.0.3.048/intel64/lib -Xlinker --enable-new-dtags -Xlinker -rpath -Xlinker /softs/intel//impi/5.0.3.048/intel64/lib/release_mt -Xlinker -rpath -Xlinker /softs/intel//impi/5.0.3.048/intel64/lib -Xlinker -rpath -Xlinker /opt/intel/mpi-rt/5.0/intel64/lib/release_mt -Xlinker -rpath -Xlinker /opt/intel/mpi-rt/5.0/intel64/lib -lmpifort -lmpi -lmpigi -ldl -lrt -lpthread
$
]
[pre
$ mpicc --showme
icc -I/opt/mpi/bullxmpi/1.2.8.3/include -pthread -L/opt/mpi/bullxmpi/1.2.8.3/lib -lmpi -ldl -lm -lnuma -Wl,--export-dynamic -lrt -lnsl -lutil -lm -ldl
$ mpicc --showme:compile
-I/opt/mpi/bullxmpi/1.2.8.3/include -pthread
$ mpicc --showme:link
-pthread -L/opt/mpi/bullxmpi/1.2.8.3/lib -lmpi -ldl -lm -lnuma -Wl,--export-dynamic -lrt -lnsl -lutil -lm -ldl
$
]

To see the results of MPI auto-detection, pass `--debug-configuration` on
the bjam command line.

* [*If you want to run the regression tests]

...which is a good thing.

The (optional) third argument configures Boost.MPI for running
regression tests. These parameters specify the executable used to
launch jobs (the default is "mpirun"), followed by any necessary
arguments, and the flag used to pass the number of processors
(default: "-np"). With the default parameters,
for instance, the test harness will execute, e.g.,

[pre
mpirun -np 4 all_gather_test
]

Some implementations provide alternative launchers that can be more convenient. For example, Intel's MPI provides `mpiexec.hydra`:

[pre
$mpiexec.hydra -np 4 all_gather_test
]

which does not require any daemon to be running (as opposed to their `mpirun` command). Such a launcher needs to be specified though:

[pre
using mpi : mpiicc :
      .....
    : mpiexec.hydra -n ;
]

[endsect]
[section:installation Build and Install]

To build the whole Boost distribution:
[pre
$cd <boost distribution>
$./b2 install
]

[tip
Or, if you have a multi-cpu machine (say 24):

[pre
$cd <boost distribution>
$./b2 -j24 install
]
]

Installation of Boost.MPI can be performed in the build step by
specifying `install` on the command line and (optionally) providing an
installation location, e.g.,

[pre
$./b2 install
]

This command will install libraries into a default system location. To
change the path where libraries will be installed, add the option
`--prefix=PATH`.

Then, you can run the regression tests with:
[pre
$cd <boost distribution>/libs/mpi/test
$../../../b2
]

[endsect]
[endsect]
[section:using Using Boost.MPI]

To build applications based on Boost.MPI, compile and link them as you
normally would for MPI programs, but remember to link against the
`boost_mpi` and `boost_serialization` libraries, e.g.,

[pre
mpic++ -I/path/to/boost/mpi my_application.cpp -Llibdir \
  -lboost_mpi-gcc-mt-1_35 -lboost_serialization-gcc-mt-1_35
]

If you plan to use the [link mpi.python Python bindings] for
Boost.MPI in conjunction with the C++ Boost.MPI, you will also need to
link against the boost_mpi_python library, e.g., by adding
`-lboost_mpi_python-gcc-mt-1_35` to your link command. This step will
only be necessary if you intend to [link mpi.python_user_data
register C++ types] or use the [link
mpi.python_skeleton_content skeleton/content mechanism] from
within Python.

[endsect]

[endsect]

[section:tutorial Tutorial]

A Boost.MPI program consists of many cooperating processes (possibly
running on different computers) that communicate among themselves by
passing messages. Boost.MPI is a library (as is the lower-level MPI),
not a language, so the first step in a Boost.MPI program is to create an
[classref boost::mpi::environment mpi::environment] object
that initializes the MPI environment and enables communication among
the processes. The [classref boost::mpi::environment
mpi::environment] object is initialized with the program arguments
(which it may modify) in your main program. The creation of this
object initializes MPI, and its destruction will finalize MPI. In the
vast majority of Boost.MPI programs, an instance of [classref
boost::mpi::environment mpi::environment] will be declared
in `main` at the very beginning of the program.

Communication with MPI always occurs over a *communicator*,
which can be created by simply default-constructing an object of type
[classref boost::mpi::communicator mpi::communicator]. This
communicator can then be queried to determine how many processes are
running (the "size" of the communicator) and to give a unique number
to each process, from zero to one less than the size of the
communicator (i.e., the "rank" of the process):

    #include <boost/mpi/environment.hpp>
    #include <boost/mpi/communicator.hpp>
    #include <iostream>
    namespace mpi = boost::mpi;

    int main()
    {
      mpi::environment env;
      mpi::communicator world;
      std::cout << "I am process " << world.rank() << " of " << world.size()
                << "." << std::endl;
      return 0;
    }

If you run this program with 7 processes, for instance, you will
receive output such as:

[pre
I am process 5 of 7.
I am process 0 of 7.
I am process 1 of 7.
I am process 6 of 7.
I am process 2 of 7.
I am process 4 of 7.
I am process 3 of 7.
]

Of course, the processes can execute in a different order each time,
so the ranks might not be strictly increasing. More interestingly, the
text could come out completely garbled, because one process can start
writing "I am process" before another process has finished writing
"of 7.".

If you should still have an MPI library supporting only MPI 1.1, you
will need to pass the command line arguments to the environment
constructor as shown in this example:

    #include <boost/mpi/environment.hpp>
    #include <boost/mpi/communicator.hpp>
    #include <iostream>
    namespace mpi = boost::mpi;

    int main(int argc, char* argv[])
    {
      mpi::environment env(argc, argv);
      mpi::communicator world;
      std::cout << "I am process " << world.rank() << " of " << world.size()
                << "." << std::endl;
      return 0;
    }

[section:point_to_point Point-to-Point communication]

As a message passing library, MPI's primary purpose is to route
messages from one process to another, i.e., point-to-point. MPI
contains routines that can send messages, receive messages, and query
whether messages are available. Each message has a source process, a
target process, a tag, and a payload containing arbitrary data. The
source and target processes are the ranks of the sender and receiver
of the message, respectively. Tags are integers that allow the
receiver to distinguish between different messages coming from the
same sender.

The following program uses two MPI processes to write "Hello, world!"
to the screen (`hello_world.cpp`):

    #include <boost/mpi.hpp>
    #include <iostream>
    #include <string>
    #include <boost/serialization/string.hpp>
    namespace mpi = boost::mpi;

    int main()
    {
      mpi::environment env;
      mpi::communicator world;

      if (world.rank() == 0) {
        world.send(1, 0, std::string("Hello"));
        std::string msg;
        world.recv(1, 1, msg);
        std::cout << msg << "!" << std::endl;
      } else {
        std::string msg;
        world.recv(0, 0, msg);
        std::cout << msg << ", ";
        std::cout.flush();
        world.send(0, 1, std::string("world"));
      }

      return 0;
    }

The first processor (rank 0) passes the message "Hello" to the second
processor (rank 1) using tag 0. The second processor prints the string
it receives, along with a comma, then passes the message "world" back
to processor 0 with a different tag. The first processor then writes
this message with the "!" and exits. All sends are accomplished with
the [memberref boost::mpi::communicator::send
communicator::send] method and all receives use a corresponding
[memberref boost::mpi::communicator::recv
communicator::recv] call.

[section:nonblocking Non-blocking communication]

The default MPI communication operations--`send` and `recv`--may have
to wait until the entire transmission is completed before they can
return. Sometimes this *blocking* behavior has a negative impact on
performance, because the sender could be performing useful computation
while it is waiting for the transmission to occur. More important,
however, are the cases where several communication operations must
occur simultaneously, e.g., a process will both send and receive at
the same time.

Let's revisit our "Hello, world!" program from the previous
section. The core of this program transmits two messages:

    if (world.rank() == 0) {
      world.send(1, 0, std::string("Hello"));
      std::string msg;
      world.recv(1, 1, msg);
      std::cout << msg << "!" << std::endl;
    } else {
      std::string msg;
      world.recv(0, 0, msg);
      std::cout << msg << ", ";
      std::cout.flush();
      world.send(0, 1, std::string("world"));
    }

The first process passes a message to the second process, then
prepares to receive a message. The second process does the send and
receive in the opposite order. However, this sequence of events is
just that--a *sequence*--meaning that there is essentially no
parallelism. We can use non-blocking communication to ensure that the
two messages are transmitted simultaneously
(`hello_world_nonblocking.cpp`):

    #include <boost/mpi.hpp>
    #include <iostream>
    #include <string>
    #include <boost/serialization/string.hpp>
    namespace mpi = boost::mpi;

    int main()
    {
      mpi::environment env;
      mpi::communicator world;

      if (world.rank() == 0) {
        mpi::request reqs[2];
        std::string msg, out_msg = "Hello";
        reqs[0] = world.isend(1, 0, out_msg);
        reqs[1] = world.irecv(1, 1, msg);
        mpi::wait_all(reqs, reqs + 2);
        std::cout << msg << "!" << std::endl;
      } else {
        mpi::request reqs[2];
        std::string msg, out_msg = "world";
        reqs[0] = world.isend(0, 1, out_msg);
        reqs[1] = world.irecv(0, 0, msg);
        mpi::wait_all(reqs, reqs + 2);
        std::cout << msg << ", ";
      }

      return 0;
    }

We have replaced calls to the [memberref
boost::mpi::communicator::send communicator::send] and
[memberref boost::mpi::communicator::recv
communicator::recv] members with similar calls to their non-blocking
counterparts, [memberref boost::mpi::communicator::isend
communicator::isend] and [memberref
boost::mpi::communicator::irecv communicator::irecv]. The
prefix *i* indicates that the operations return immediately with a
[classref boost::mpi::request mpi::request] object, which
allows one to query the status of a communication request (see the
[memberref boost::mpi::request::test test] method) or wait
until it has completed (see the [memberref
boost::mpi::request::wait wait] method). Multiple requests
can be completed at the same time with the [funcref
boost::mpi::wait_all wait_all] operation.

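For example, a process can overlap a pending receive with local work by
polling the request. The loop below is a minimal sketch of this pattern
(ours, not one of the library's examples); it relies on `request::test()`
returning an empty `boost::optional<mpi::status>` while the operation is
still in flight:

    std::string msg;
    mpi::request req = world.irecv(1, 1, msg);
    while (!req.test()) {
      // The receive has not completed yet: do useful local work here.
    }
    // At this point msg has been filled in.
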
Important note: The MPI standard requires users to keep the request
handle for a non-blocking communication, and to call the "wait"
operation (or successfully test for completion) to complete the send
or receive. Unlike most C MPI implementations, which allow the user to
discard the request for a non-blocking send, Boost.MPI requires the
user to call "wait" or "test", since the request object might contain
temporary buffers that have to be kept until the send is
completed. Moreover, the MPI standard does not guarantee that the
receive makes any progress before a call to "wait" or "test", although
most implementations of the C MPI do allow receives to progress before
the call to "wait" or "test". Boost.MPI, on the other hand, generally
requires "test" or "wait" calls to make progress.

If you run this program multiple times, you may see some strange
results: namely, some runs will produce:

    Hello, world!

while others will produce:

    world!
    Hello,

or even some garbled version of the letters in "Hello" and
"world". This indicates that there is some parallelism in the program,
because after both messages are (simultaneously) transmitted, both
processes will concurrently execute their print statements. For both
performance and correctness, non-blocking communication operations are
critical to many parallel applications using MPI.

[endsect]

[section:user_data_types User-defined data types]

The inclusion of `boost/serialization/string.hpp` in the previous
examples is very important: it makes values of type `std::string`
serializable, so that they can be transmitted using Boost.MPI. In
general, built-in C++ types (`int`s, `float`s, characters, etc.) can
be transmitted over MPI directly, while user-defined and
library-defined types will need to first be serialized (packed) into a
format that is amenable to transmission. Boost.MPI relies on the
_Serialization_ library to serialize and deserialize data types.

For types defined by the standard library (such as `std::string` or
`std::vector`) and some types in Boost (such as `boost::variant`), the
_Serialization_ library already contains all of the required
serialization code. In these cases, you need only include the
appropriate header from the `boost/serialization` directory.

[def _gps_position_ [link gps_position `gps_position`]]
For types that do not already have a serialization header, you will
first need to implement serialization code before the types can be
transmitted using Boost.MPI. Consider a simple class _gps_position_
that contains members `degrees`, `minutes`, and `seconds`. This class
is made serializable by making it a friend of
`boost::serialization::access` and introducing the templated
`serialize()` function, as follows:[#gps_position]

    class gps_position
    {
    private:
      friend class boost::serialization::access;

      template<class Archive>
      void serialize(Archive & ar, const unsigned int version)
      {
        ar & degrees;
        ar & minutes;
        ar & seconds;
      }

      int degrees;
      int minutes;
      float seconds;
    public:
      gps_position() {}
      gps_position(int d, int m, float s) :
        degrees(d), minutes(m), seconds(s)
      {}
    };

Complete information about making types serializable is beyond the
scope of this tutorial. For more information, please see the
_Serialization_ library tutorial from which the above example was
extracted. One important side benefit of making types serializable for
Boost.MPI is that they become serializable for any other usage, such
as storing the objects to disk or manipulating them in XML.

Some serializable types, like _gps_position_ above, have a fixed
amount of data stored at fixed offsets and are fully defined by
the values of their data members (most PODs with no pointers are good examples).
When this is the case, Boost.MPI can optimize their serialization and
transmission by avoiding extraneous copy operations.
To enable this optimization, users must specialize the type trait [classref
boost::mpi::is_mpi_datatype `is_mpi_datatype`], e.g.:

    namespace boost { namespace mpi {
      template <>
      struct is_mpi_datatype<gps_position> : mpl::true_ { };
    } }

For non-template types we have defined a macro to simplify declaring a type
as an MPI datatype:

    BOOST_IS_MPI_DATATYPE(gps_position)

For composite types, the specialization of [classref
boost::mpi::is_mpi_datatype `is_mpi_datatype`] may depend on
`is_mpi_datatype` itself. For instance, a `boost::array` object is
fixed only when the type of the parameter it stores is fixed:

    namespace boost { namespace mpi {
      template <typename T, std::size_t N>
      struct is_mpi_datatype<array<T, N> >
        : public is_mpi_datatype<T> { };
    } }

The redundant copy elimination optimization can only be applied when
the shape of the data type is completely fixed. Variable-length types
(e.g., strings, linked lists) and types that store pointers cannot use
the optimization, and Boost.MPI will be unable to detect this error at
compile time. Attempting to perform this optimization when it is not
correct will likely result in segmentation faults and other strange
program behavior.

Boost.MPI can transmit any user-defined data type from one process to
another. Built-in types can be transmitted without any extra effort;
library-defined types require the inclusion of a serialization header;
and user-defined types will require the addition of serialization
code. Fixed data types can be optimized for transmission using the
[classref boost::mpi::is_mpi_datatype `is_mpi_datatype`]
type trait.

[endsect]
[endsect]

[section:collectives Collective operations]

[link mpi.point_to_point Point-to-point operations] are the
core message passing primitives in Boost.MPI. However, many
message-passing applications also require higher-level communication
algorithms that combine or summarize the data stored on many different
processes. These algorithms support many common tasks such as
"broadcast this value to all processes", "compute the sum of the
values on all processors" or "find the global minimum."

[section:broadcast Broadcast]
The [funcref boost::mpi::broadcast `broadcast`] algorithm is
by far the simplest collective operation. It broadcasts a value from a
single process to all other processes within a [classref
boost::mpi::communicator communicator]. For instance, the
following program broadcasts "Hello, World!" from process 0 to every
other process. (`hello_world_broadcast.cpp`)

    #include <boost/mpi.hpp>
    #include <iostream>
    #include <string>
    #include <boost/serialization/string.hpp>
    namespace mpi = boost::mpi;

    int main()
    {
      mpi::environment env;
      mpi::communicator world;

      std::string value;
      if (world.rank() == 0) {
        value = "Hello, World!";
      }

      broadcast(world, value, 0);

      std::cout << "Process #" << world.rank() << " says " << value
                << std::endl;
      return 0;
    }

Running this program with seven processes will produce a result such
as:

[pre
Process #0 says Hello, World!
Process #2 says Hello, World!
Process #1 says Hello, World!
Process #4 says Hello, World!
Process #3 says Hello, World!
Process #5 says Hello, World!
Process #6 says Hello, World!
]
[endsect]

[section:gather Gather]
The [funcref boost::mpi::gather `gather`] collective gathers
the values produced by every process in a communicator into a vector
of values on the "root" process (specified by an argument to
`gather`). The /i/th element in the vector will correspond to the
value gathered from the /i/th process. For instance, in the following
program each process computes its own random number. All of these
random numbers are gathered at process 0 (the "root" in this case),
which prints out the values that correspond to each processor.
(`random_gather.cpp`)

    #include <boost/mpi.hpp>
    #include <iostream>
    #include <vector>
    #include <cstdlib>
    #include <ctime>
    namespace mpi = boost::mpi;

    int main()
    {
      mpi::environment env;
      mpi::communicator world;

      std::srand(time(0) + world.rank());
      int my_number = std::rand();
      if (world.rank() == 0) {
        std::vector<int> all_numbers;
        gather(world, my_number, all_numbers, 0);
        for (int proc = 0; proc < world.size(); ++proc)
          std::cout << "Process #" << proc << " thought of "
                    << all_numbers[proc] << std::endl;
      } else {
        gather(world, my_number, 0);
      }

      return 0;
    }

Executing this program with seven processes will result in output such
as the following. Although the random values will change from one run
to the next, the order of the processes in the output will remain the
same because only process 0 writes to `std::cout`.

[pre
Process #0 thought of 332199874
Process #1 thought of 20145617
Process #2 thought of 1862420122
Process #3 thought of 480422940
Process #4 thought of 1253380219
Process #5 thought of 949458815
Process #6 thought of 650073868
]

The `gather` operation collects values from every process into a
vector at one process. If instead the values from every process need
to be collected into identical vectors on every process, use the
[funcref boost::mpi::all_gather `all_gather`] algorithm,
which is semantically equivalent to calling `gather` followed by a
`broadcast` of the resulting vector.

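For instance, the `random_gather.cpp` example above could be rewritten
so that every process, not just the root, ends up with the full
vector. A minimal sketch (ours, not an example shipped with the
library):

    // Every process contributes my_number and receives everyone's values.
    std::vector<int> all_numbers;
    all_gather(world, my_number, all_numbers);
    std::cout << "Process #" << world.rank() << " saw "
              << all_numbers.size() << " values." << std::endl;
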
[endsect]

[section:reduce Reduce]

The [funcref boost::mpi::reduce `reduce`] collective
summarizes the values from each process into a single value at the
user-specified "root" process. The Boost.MPI `reduce` operation is
similar in spirit to the STL _accumulate_ operation, because it takes
a sequence of values (one per process) and combines them via a
function object. For instance, we can randomly generate values in each
process and then compute the minimum value over all processes via a
call to [funcref boost::mpi::reduce `reduce`]
(`random_min.cpp`):

    #include <boost/mpi.hpp>
    #include <iostream>
    #include <cstdlib>
    #include <ctime>
    namespace mpi = boost::mpi;

    int main()
    {
      mpi::environment env;
      mpi::communicator world;

      std::srand(time(0) + world.rank());
      int my_number = std::rand();

      if (world.rank() == 0) {
        int minimum;
        reduce(world, my_number, minimum, mpi::minimum<int>(), 0);
        std::cout << "The minimum value is " << minimum << std::endl;
      } else {
        reduce(world, my_number, mpi::minimum<int>(), 0);
      }

      return 0;
    }

The use of `mpi::minimum<int>` indicates that the minimum value
should be computed. `mpi::minimum<int>` is a binary function object
that compares its two parameters via `<` and returns the smaller
value. Any associative binary function or function object will
work. For instance, to concatenate strings with `reduce` one could use
the function object `std::plus<std::string>` (`string_cat.cpp`):

    #include <boost/mpi.hpp>
    #include <iostream>
    #include <string>
    #include <functional>
    #include <boost/serialization/string.hpp>
    namespace mpi = boost::mpi;

    int main()
    {
      mpi::environment env;
      mpi::communicator world;

      std::string names[10] = { "zero ", "one ", "two ", "three ",
                                "four ", "five ", "six ", "seven ",
                                "eight ", "nine " };

      std::string result;
      reduce(world,
             world.rank() < 10? names[world.rank()]
                              : std::string("many "),
             result, std::plus<std::string>(), 0);

      if (world.rank() == 0)
        std::cout << "The result is " << result << std::endl;

      return 0;
    }

In this example, we compute a string for each process and then perform
a reduction that concatenates all of the strings together into one,
long string. Executing this program with seven processors yields the
following output:

[pre
The result is zero one two three four five six
]

Any kind of binary function object can be used with `reduce`; there
are many such function objects in the C++ standard
`<functional>` header and the Boost.MPI header
`<boost/mpi/operations.hpp>`. Or, you can create your own
function object. Function objects used with `reduce` must be
associative, i.e. `f(x, f(y, z))` must be equivalent to `f(f(x, y),
z)`. If they are also commutative (i.e., `f(x, y) == f(y, x)`),
Boost.MPI can use a more efficient implementation of `reduce`. To
state that a function object is commutative, you will need to
specialize the class [classref boost::mpi::is_commutative
`is_commutative`]. For instance, we could modify the previous example
by telling Boost.MPI that string concatenation is commutative:

    namespace boost { namespace mpi {

      template<>
      struct is_commutative<std::plus<std::string>, std::string>
        : mpl::true_ { };

    } } // end namespace boost::mpi

By adding this code prior to `main()`, Boost.MPI will assume that
string concatenation is commutative and employ a different parallel
algorithm for the `reduce` operation. Using this algorithm, the
program outputs the following when run with seven processes:

[pre
The result is zero one four five six two three
]

Note how the numbers in the resulting string are in a different order:
this is a direct result of Boost.MPI reordering operations. The result
in this case differed from the non-commutative result because string
concatenation is not commutative: `f("x", "y")` is not the same as
`f("y", "x")`, because argument order matters. For truly commutative
operations (e.g., integer addition), the more efficient commutative
algorithm will produce the same result as the non-commutative
algorithm. Boost.MPI also performs direct mappings from function
objects in `<functional>` to `MPI_Op` values predefined by MPI (e.g.,
`MPI_SUM`, `MPI_MAX`); if you have your own function objects that can
take advantage of this mapping, see the class template [classref
boost::mpi::is_mpi_op `is_mpi_op`].

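As a sketch of that mapping (ours, not from the library's examples, and
assuming the extension protocol described in the [classref
boost::mpi::is_mpi_op `is_mpi_op`] reference: a specialization derives
from `mpl::true_` and provides a static `op()` member returning the
corresponding `MPI_Op`), one could map a hypothetical user-defined
function object onto the predefined `MPI_SUM` operation:

    // Hypothetical user-defined function object behaving like addition.
    struct my_plus {
      int operator()(int a, int b) const { return a + b; }
    };

    namespace boost { namespace mpi {

      // Tell Boost.MPI that my_plus over int is the built-in MPI_SUM
      // (assumed extension protocol; see the is_mpi_op reference).
      template<>
      struct is_mpi_op<my_plus, int> : mpl::true_
      {
        static MPI_Op op() { return MPI_SUM; }
      };

    } } // end namespace boost::mpi
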
Like [link mpi.gather `gather`], `reduce` has an "all"
variant called [funcref boost::mpi::all_reduce `all_reduce`]
that performs the reduction operation and broadcasts the result to all
processes. This variant is useful, for instance, in establishing
global minimum or maximum values.

The following code (`global_min.cpp`) shows a broadcasting version of
the `random_min.cpp` example:

    #include <boost/mpi.hpp>
    #include <iostream>
    #include <cstdlib>
    namespace mpi = boost::mpi;

    int main(int argc, char* argv[])
    {
      mpi::environment env(argc, argv);
      mpi::communicator world;

      std::srand(world.rank());
      int my_number = std::rand();
      int minimum;

      all_reduce(world, my_number, minimum, mpi::minimum<int>());

      if (world.rank() == 0) {
        std::cout << "The minimum value is " << minimum << std::endl;
      }

      return 0;
    }

In that example we provide both input and output values, requiring
twice as much space, which can be a problem depending on the size
of the transmitted data.
If there is no need to preserve the input value, the output value
can be omitted. In that case the input value will be overwritten with
the output value and Boost.MPI is able, in some situations, to implement
the operation with a more space-efficient solution (using the `MPI_IN_PLACE`
flag of the MPI C mapping), as in the following example (`in_place_global_min.cpp`):

    #include <boost/mpi.hpp>
    #include <iostream>
    #include <cstdlib>
    namespace mpi = boost::mpi;

    int main(int argc, char* argv[])
    {
      mpi::environment env(argc, argv);
      mpi::communicator world;

      std::srand(world.rank());
      int my_number = std::rand();

      all_reduce(world, my_number, mpi::minimum<int>());

      if (world.rank() == 0) {
        std::cout << "The minimum value is " << my_number << std::endl;
      }

      return 0;
    }

[endsect]

[endsect]

[section:communicators Managing communicators]

Communication with Boost.MPI always occurs over a communicator. A
communicator contains a set of processes that can send messages among
themselves and perform collective operations. There can be many
communicators within a single program, each of which contains its own
isolated communication space that acts independently of the other
communicators.

When the MPI environment is initialized, only the "world" communicator
(called `MPI_COMM_WORLD` in the MPI C and Fortran bindings) is
available. The "world" communicator, accessed by default-constructing
a [classref boost::mpi::communicator mpi::communicator]
object, contains all of the MPI processes present when the program
begins execution. Other communicators can then be constructed by
duplicating or building subsets of the "world" communicator. For
instance, in the following program we split the processes into two
groups: one for processes generating data and the other for processes
that will collect the data. (`generate_collect.cpp`)

    #include <boost/mpi.hpp>
    #include <iostream>
    #include <cstdlib>
    #include <boost/serialization/vector.hpp>
    namespace mpi = boost::mpi;

    enum message_tags {msg_data_packet, msg_broadcast_data, msg_finished};

    void generate_data(mpi::communicator local, mpi::communicator world);
    void collect_data(mpi::communicator local, mpi::communicator world);

    int main()
    {
      mpi::environment env;
      mpi::communicator world;

      bool is_generator = world.rank() < 2 * world.size() / 3;
      mpi::communicator local = world.split(is_generator? 0 : 1);
      if (is_generator) generate_data(local, world);
      else collect_data(local, world);

      return 0;
    }

When communicators are split in this way, their processes retain
membership in both the original communicator (which is not altered by
the split) and the new communicator. However, the ranks of the
processes may be different from one communicator to the next, because
the rank values within a communicator are always contiguous values
starting at zero. In the example above, the first two thirds of the
processes become "generators" and the remaining processes become
"collectors". The ranks of the "collectors" in the `world`
communicator will be 2/3 `world.size()` and greater, whereas the ranks
of the same collector processes in the `local` communicator will start
at zero. The following excerpt from `collect_data()` (in
`generate_collect.cpp`) illustrates how to manage multiple
communicators:

    mpi::status msg = world.probe();
    if (msg.tag() == msg_data_packet) {
      // Receive the packet of data
      std::vector<int> data;
      world.recv(msg.source(), msg.tag(), data);

      // Tell each of the collectors that we'll be broadcasting some data
      for (int dest = 1; dest < local.size(); ++dest)
        local.send(dest, msg_broadcast_data, msg.source());

      // Broadcast the actual data.
      broadcast(local, data, 0);
    }

The code in this excerpt is executed by the "master" collector, e.g.,
the node with rank 2/3 `world.size()` in the `world` communicator and
rank 0 in the `local` (collector) communicator. It receives a message
from a generator via the `world` communicator, then broadcasts the
message to each of the collectors via the `local` communicator.

For more control in the creation of communicators for subgroups of
processes, the Boost.MPI [classref boost::mpi::group `group`] provides
facilities to compute the union (`|`), intersection (`&`), and
difference (`-`) of two groups, generate arbitrary subgroups, etc.

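For example, a communicator containing only the even-ranked processes
could be built from a group. The following is a minimal sketch (ours,
not from the library's examples); it assumes the documented
[memberref boost::mpi::group::include `group::include`] member and the
[classref boost::mpi::communicator `communicator`] constructor that
takes a group:

    // Collect the even ranks of the world communicator.
    std::vector<int> even_ranks;
    for (int r = 0; r < world.size(); r += 2)
      even_ranks.push_back(r);

    // Build a subgroup and a communicator over it. Processes that are
    // not members of the group receive a null communicator.
    mpi::group evens = world.group().include(even_ranks.begin(),
                                             even_ranks.end());
    mpi::communicator even_comm(world, evens);
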
[endsect]

[section:skeleton_and_content Separating structure from content]

When communicating data types over MPI that are not fundamental to MPI
(such as strings, lists, and user-defined data types), Boost.MPI must
first serialize these data types into a buffer and then communicate
them; the receiver then copies the results into a buffer before
deserializing into an object on the other end. For some data types,
this overhead can be eliminated by using [classref
boost::mpi::is_mpi_datatype `is_mpi_datatype`]. However,
variable-length data types such as strings and lists cannot be MPI
data types.

Boost.MPI supports a second technique for improving performance by
separating the structure of these variable-length data structures from
the content stored in the data structures. This feature is only
beneficial when the shape of the data structure remains the same but
the content of the data structure will need to be communicated several
times. For instance, in a finite element analysis the structure of the
mesh may be fixed at the beginning of computation but the various
variables on the cells of the mesh (temperature, stress, etc.) will be
communicated many times within the iterative analysis process. In this
case, Boost.MPI allows one to first send the "skeleton" of the mesh
once, then transmit the "content" multiple times. Since the content
need not contain any information about the structure of the data type,
it can be transmitted without creating separate communication buffers.

To illustrate the use of skeletons and content, we will take a
somewhat more limited example wherein a master process generates
random number sequences into a list and transmits them to several
slave processes. The length of the list will be fixed at program
startup, so the content of the list (i.e., the current sequence of
numbers) can be transmitted efficiently. The complete example is
available in `example/random_content.cpp`. We begin with the master
process (rank 0), which builds a list, communicates its structure via
a [funcref boost::mpi::skeleton `skeleton`], then repeatedly
generates random number sequences to be broadcast to the slave
processes via [classref boost::mpi::content `content`]:

    // Generate the list and broadcast its structure
    std::list<int> l(list_len);
    broadcast(world, mpi::skeleton(l), 0);

    // Generate content several times and broadcast out that content
    mpi::content c = mpi::get_content(l);
    for (int i = 0; i < iterations; ++i) {
      // Generate new random values
      std::generate(l.begin(), l.end(), &random);

      // Broadcast the new content of l
      broadcast(world, c, 0);
    }

    // Notify the slaves that we're done by sending all zeroes
    std::fill(l.begin(), l.end(), 0);
    broadcast(world, c, 0);

The slave processes have a very similar structure to the master. They
receive (via the [funcref boost::mpi::broadcast
`broadcast()`] call) the skeleton of the data structure, then use it
to build their own lists of integers. In each iteration, they receive
via another `broadcast()` the new content in the data structure and
compute some property of the data:

    // Receive the skeleton and build up our own list
    std::list<int> l;
    broadcast(world, mpi::skeleton(l), 0);

    mpi::content c = mpi::get_content(l);
    int i = 0;
    do {
      broadcast(world, c, 0);

      if (std::find_if
           (l.begin(), l.end(),
            std::bind1st(std::not_equal_to<int>(), 0)) == l.end())
        break;

      // Compute some property of the data.

      ++i;
    } while (true);

The skeletons and content of any Serializable data type can be
transmitted either via the [memberref
boost::mpi::communicator::send `send`] and [memberref
boost::mpi::communicator::recv `recv`] members of the
[classref boost::mpi::communicator `communicator`] class
(for point-to-point communication) or broadcast via the [funcref
boost::mpi::broadcast `broadcast()`] collective. When
separating a data structure into a skeleton and content, be careful
not to modify the data structure (either on the sender side or the
receiver side) without transmitting the skeleton again. Boost.MPI
cannot detect these accidental modifications to the data structure, which
will likely result in incorrect data being transmitted or unstable
programs.

[endsect]

[section:performance_optimizations Performance optimizations]
[section:serialization_optimizations Serialization optimizations]

To obtain optimal performance for small fixed-length data types not containing
any pointers it is very important to mark them using the type traits of
Boost.MPI and Boost.Serialization.

It was already discussed that fixed-length types containing no pointers can be
marked as MPI datatypes via the [classref
boost::mpi::is_mpi_datatype `is_mpi_datatype`] trait, e.g.:

    namespace boost { namespace mpi {
      template <>
      struct is_mpi_datatype<gps_position> : mpl::true_ { };
    } }

or the equivalent macro

    BOOST_IS_MPI_DATATYPE(gps_position)

In addition it can give a substantial performance gain to turn off tracking
and versioning for these types, if no pointers to these types are used, by
using the traits classes or helper macros of Boost.Serialization:

    BOOST_CLASS_TRACKING(gps_position,track_never)
    BOOST_CLASS_IMPLEMENTATION(gps_position,object_serializable)

[endsect]

[section:homogeneous_machines Homogeneous Machines]

More optimizations are possible on homogeneous machines, by avoiding
MPI_Pack/MPI_Unpack calls and using direct bitwise copy instead. This feature is
enabled by default through the macro [macroref BOOST_MPI_HOMOGENEOUS], defined in the include
file `boost/mpi/config.hpp`.
That definition must be consistent when building Boost.MPI and
when building the application.

In addition all classes need to be marked both as `is_mpi_datatype` and
as `is_bitwise_serializable`, by using the helper macro of Boost.Serialization:

    BOOST_IS_BITWISE_SERIALIZABLE(gps_position)

Usually it is safe to serialize a class for which `is_mpi_datatype` is true
by using a binary copy of the bits. The exceptions are classes for which
some members should be skipped for serialization.

[endsect]
[endsect]

[section:c_mapping Mapping from C MPI to Boost.MPI]

This section provides tables that map from the functions and constants
of the standard C MPI to their Boost.MPI equivalents. It will be most
useful for users that are already familiar with the C or Fortran
interfaces to MPI, or for porting existing parallel programs to Boost.MPI.

[table Point-to-point communication
  [[C Function/Constant] [Boost.MPI Equivalent]]

  [[`MPI_ANY_SOURCE`] [`any_source`]]

  [[`MPI_ANY_TAG`] [`any_tag`]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node40.html#Node40
  `MPI_Bsend`]] [unsupported]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node51.html#Node51
  `MPI_Bsend_init`]] [unsupported]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node42.html#Node42
  `MPI_Buffer_attach`]] [unsupported]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node42.html#Node42
  `MPI_Buffer_detach`]] [unsupported]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node50.html#Node50
  `MPI_Cancel`]]
   [[memberref boost::mpi::request::cancel `request::cancel`]]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node35.html#Node35
  `MPI_Get_count`]]
   [[memberref boost::mpi::status::count `status::count`]]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node46.html#Node46
  `MPI_Ibsend`]] [unsupported]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node50.html#Node50
  `MPI_Iprobe`]]
   [[memberref boost::mpi::communicator::iprobe `communicator::iprobe`]]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node46.html#Node46
  `MPI_Irsend`]] [unsupported]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node46.html#Node46
  `MPI_Isend`]]
   [[memberref boost::mpi::communicator::isend `communicator::isend`]]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node46.html#Node46
  `MPI_Issend`]] [unsupported]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node46.html#Node46
  `MPI_Irecv`]]
   [[memberref boost::mpi::communicator::irecv `communicator::irecv`]]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node50.html#Node50
  `MPI_Probe`]]
   [[memberref boost::mpi::communicator::probe `communicator::probe`]]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node53.html#Node53
  `MPI_PROC_NULL`]] [unsupported]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node34.html#Node34 `MPI_Recv`]]
   [[memberref boost::mpi::communicator::recv `communicator::recv`]]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node51.html#Node51
  `MPI_Recv_init`]] [unsupported]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node47.html#Node47
  `MPI_Request_free`]] [unsupported]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node40.html#Node40
  `MPI_Rsend`]] [unsupported]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node51.html#Node51
  `MPI_Rsend_init`]] [unsupported]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node31.html#Node31
  `MPI_Send`]]
   [[memberref boost::mpi::communicator::send `communicator::send`]]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node52.html#Node52
  `MPI_Sendrecv`]] [unsupported]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node52.html#Node52
  `MPI_Sendrecv_replace`]] [unsupported]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node51.html#Node51
  `MPI_Send_init`]] [unsupported]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node40.html#Node40
  `MPI_Ssend`]] [unsupported]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node51.html#Node51
  `MPI_Ssend_init`]] [unsupported]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node51.html#Node51
  `MPI_Start`]] [unsupported]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node51.html#Node51
  `MPI_Startall`]] [unsupported]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node47.html#Node47
  `MPI_Test`]] [[memberref boost::mpi::request::test `request::test`]]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node47.html#Node47
  `MPI_Testall`]] [[funcref boost::mpi::test_all `test_all`]]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node47.html#Node47
  `MPI_Testany`]] [[funcref boost::mpi::test_any `test_any`]]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node47.html#Node47
  `MPI_Testsome`]] [[funcref boost::mpi::test_some `test_some`]]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node50.html#Node50
  `MPI_Test_cancelled`]]
   [[memberref boost::mpi::status::cancelled `status::cancelled`]]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node47.html#Node47
  `MPI_Wait`]] [[memberref boost::mpi::request::wait `request::wait`]]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node47.html#Node47
  `MPI_Waitall`]] [[funcref boost::mpi::wait_all `wait_all`]]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node47.html#Node47
  `MPI_Waitany`]] [[funcref boost::mpi::wait_any `wait_any`]]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node47.html#Node47
  `MPI_Waitsome`]] [[funcref boost::mpi::wait_some `wait_some`]]]
]

Boost.MPI automatically maps C and C++ data types to their MPI
equivalents. The following table illustrates the mappings between C++
types and MPI datatype constants.

[table Datatypes
  [[C Constant] [Boost.MPI Equivalent]]

  [[`MPI_CHAR`] [`signed char`]]
  [[`MPI_SHORT`] [`signed short int`]]
  [[`MPI_INT`] [`signed int`]]
  [[`MPI_LONG`] [`signed long int`]]
  [[`MPI_UNSIGNED_CHAR`] [`unsigned char`]]
  [[`MPI_UNSIGNED_SHORT`] [`unsigned short int`]]
  [[`MPI_UNSIGNED`] [`unsigned int`]]
  [[`MPI_UNSIGNED_LONG`] [`unsigned long int`]]
  [[`MPI_FLOAT`] [`float`]]
  [[`MPI_DOUBLE`] [`double`]]
  [[`MPI_LONG_DOUBLE`] [`long double`]]
  [[`MPI_BYTE`] [unused]]
  [[`MPI_PACKED`] [used internally for [link
  mpi.user_data_types serialized data types]]]
  [[`MPI_LONG_LONG_INT`] [`long long int`, if supported by compiler]]
  [[`MPI_UNSIGNED_LONG_LONG_INT`] [`unsigned long long int`, if
  supported by compiler]]
  [[`MPI_FLOAT_INT`] [`std::pair<float, int>`]]
  [[`MPI_DOUBLE_INT`] [`std::pair<double, int>`]]
  [[`MPI_LONG_INT`] [`std::pair<long, int>`]]
  [[`MPI_2INT`] [`std::pair<int, int>`]]
  [[`MPI_SHORT_INT`] [`std::pair<short, int>`]]
  [[`MPI_LONG_DOUBLE_INT`] [`std::pair<long double, int>`]]
]

Boost.MPI does not provide direct wrappers to the MPI derived
datatypes functionality. Instead, Boost.MPI relies on the
_Serialization_ library to construct MPI datatypes for user-defined
classes. The section on [link mpi.user_data_types user-defined
data types] describes this mechanism, which is used for types that
are marked as "MPI datatypes" using [classref
boost::mpi::is_mpi_datatype `is_mpi_datatype`].

The derived datatypes table that follows describes which C++ types
correspond to the functionality of the C MPI's datatype
constructor. Boost.MPI may not actually use the C MPI function listed
when building datatypes of a certain form. Since the actual datatypes
built by Boost.MPI are typically hidden from the user, many of these
operations are called internally by Boost.MPI.

[table Derived datatypes
  [[C Function/Constant] [Boost.MPI Equivalent]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node56.html#Node56
  `MPI_Address`]] [used automatically in Boost.MPI for MPI version 1.x]]

  [[[@http://www.mpi-forum.org/docs/mpi-20-html/node76.htm#Node76
  `MPI_Get_address`]] [used automatically in Boost.MPI for MPI version 2.0 and higher]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node58.html#Node58
  `MPI_Type_commit`]] [used automatically in Boost.MPI]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node55.html#Node55
  `MPI_Type_contiguous`]] [arrays]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node56.html#Node56
  `MPI_Type_extent`]] [used automatically in Boost.MPI]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node58.html#Node58
  `MPI_Type_free`]] [used automatically in Boost.MPI]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node55.html#Node55
  `MPI_Type_hindexed`]] [any type used as a subobject]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node55.html#Node55
  `MPI_Type_hvector`]] [unused]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node55.html#Node55
  `MPI_Type_indexed`]] [any type used as a subobject]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node57.html#Node57
  `MPI_Type_lb`]] [unsupported]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node56.html#Node56
  `MPI_Type_size`]] [used automatically in Boost.MPI]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node55.html#Node55
  `MPI_Type_struct`]] [user-defined classes and structs with MPI 1.x]]

  [[[@http://www.mpi-forum.org/docs/mpi-20-html/node76.htm#Node76
  `MPI_Type_create_struct`]] [user-defined classes and structs with MPI 2.0 and higher]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node57.html#Node57
  `MPI_Type_ub`]] [unsupported]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node55.html#Node55
  `MPI_Type_vector`]] [used automatically in Boost.MPI]]
]

MPI's packing facilities store values into a contiguous buffer, which
can later be transmitted via MPI and unpacked into separate values via
MPI's unpacking facilities. As with datatypes, Boost.MPI provides an
abstract interface to MPI's packing and unpacking facilities. In
particular, the two archive classes [classref
boost::mpi::packed_oarchive `packed_oarchive`] and [classref
boost::mpi::packed_iarchive `packed_iarchive`] can be used
to pack or unpack a contiguous buffer using MPI's facilities.

[table Packing and unpacking
  [[C Function] [Boost.MPI Equivalent]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node62.html#Node62
  `MPI_Pack`]] [[classref
  boost::mpi::packed_oarchive `packed_oarchive`]]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node62.html#Node62
  `MPI_Pack_size`]] [used internally by Boost.MPI]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node62.html#Node62
  `MPI_Unpack`]] [[classref
  boost::mpi::packed_iarchive `packed_iarchive`]]]
]

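For instance, a value can be packed into an archive on one process and
unpacked on another. The sketch below assumes two processes; the
message contents and tag are illustrative only:

  #include <boost/mpi.hpp>
  #include <boost/mpi/packed_oarchive.hpp>
  #include <boost/mpi/packed_iarchive.hpp>
  #include <string>
  namespace mpi = boost::mpi;

  int main()
  {
    mpi::environment env;
    mpi::communicator world;
    if (world.rank() == 0) {
      mpi::packed_oarchive oa(world);   // packs with MPI_Pack underneath
      std::string message("Hello, packed world!");
      oa << message;
      world.send(1, 0, oa);
    } else if (world.rank() == 1) {
      mpi::packed_iarchive ia(world);
      world.recv(0, 0, ia);             // unpacks with MPI_Unpack
      std::string message;
      ia >> message;
    }
    return 0;
  }
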
Boost.MPI supports a one-to-one mapping for most of the MPI
collectives. For each collective provided by Boost.MPI, the underlying
C MPI collective will be invoked when it is possible (and efficient)
to do so.

[table Collectives
  [[C Function] [Boost.MPI Equivalent]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node73.html#Node73
  `MPI_Allgather`]] [[funcref boost::mpi::all_gather `all_gather`]]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node73.html#Node73
  `MPI_Allgatherv`]] [most uses supported by [funcref boost::mpi::all_gather `all_gather`]]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node82.html#Node82
  `MPI_Allreduce`]] [[funcref boost::mpi::all_reduce `all_reduce`]]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node75.html#Node75
  `MPI_Alltoall`]] [[funcref boost::mpi::all_to_all `all_to_all`]]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node75.html#Node75
  `MPI_Alltoallv`]] [most uses supported by [funcref boost::mpi::all_to_all `all_to_all`]]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node66.html#Node66
  `MPI_Barrier`]] [[memberref
  boost::mpi::communicator::barrier `communicator::barrier`]]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node67.html#Node67
  `MPI_Bcast`]] [[funcref boost::mpi::broadcast `broadcast`]]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node69.html#Node69
  `MPI_Gather`]] [[funcref boost::mpi::gather `gather`]]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node69.html#Node69
  `MPI_Gatherv`]] [most uses supported by [funcref boost::mpi::gather `gather`],
  other usages supported by [funcref boost::mpi::gatherv `gatherv`]]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node77.html#Node77
  `MPI_Reduce`]] [[funcref boost::mpi::reduce `reduce`]]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node83.html#Node83
  `MPI_Reduce_scatter`]] [unsupported]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node84.html#Node84
  `MPI_Scan`]] [[funcref boost::mpi::scan `scan`]]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node71.html#Node71
  `MPI_Scatter`]] [[funcref boost::mpi::scatter `scatter`]]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node71.html#Node71
  `MPI_Scatterv`]] [most uses supported by [funcref boost::mpi::scatter `scatter`],
  other uses supported by [funcref boost::mpi::scatterv `scatterv`]]]

  [[[@http://www.mpi-forum.org/docs/mpi-20-html/node145.htm#Node145
  `MPI_IN_PLACE`]] [supported implicitly by [funcref boost::mpi::all_reduce
  `all_reduce`] by omitting the output value]]
]

Boost.MPI uses function objects to specify how reductions should occur
in its equivalents to `MPI_Allreduce`, `MPI_Reduce`, and
`MPI_Scan`. The following table illustrates how
[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node78.html#Node78
predefined] and
[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node80.html#Node80
user-defined] reduction operations can be mapped between C MPI and
Boost.MPI.

[table Reduction operations
  [[C Constant] [Boost.MPI Equivalent]]

  [[`MPI_BAND`] [[classref boost::mpi::bitwise_and `bitwise_and`]]]
  [[`MPI_BOR`] [[classref boost::mpi::bitwise_or `bitwise_or`]]]
  [[`MPI_BXOR`] [[classref boost::mpi::bitwise_xor `bitwise_xor`]]]
  [[`MPI_LAND`] [`std::logical_and`]]
  [[`MPI_LOR`] [`std::logical_or`]]
  [[`MPI_LXOR`] [[classref boost::mpi::logical_xor `logical_xor`]]]
  [[`MPI_MAX`] [[classref boost::mpi::maximum `maximum`]]]
  [[`MPI_MAXLOC`] [unsupported]]
  [[`MPI_MIN`] [[classref boost::mpi::minimum `minimum`]]]
  [[`MPI_MINLOC`] [unsupported]]
  [[`MPI_Op_create`] [used internally by Boost.MPI]]
  [[`MPI_Op_free`] [used internally by Boost.MPI]]
  [[`MPI_PROD`] [`std::multiplies`]]
  [[`MPI_SUM`] [`std::plus`]]
]

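For example, a reduction that computes the maximum across all
processes can use the [classref boost::mpi::maximum `maximum`]
function object, which maps onto `MPI_MAX` per the table above. A
minimal sketch, with rank 0 chosen arbitrarily as the root:

  #include <boost/mpi.hpp>
  #include <iostream>
  namespace mpi = boost::mpi;

  int main()
  {
    mpi::environment env;
    mpi::communicator world;

    int value = world.rank();
    int largest = 0;
    // Maps directly onto MPI_Reduce with MPI_MAX.
    mpi::reduce(world, value, largest, mpi::maximum<int>(), 0);
    if (world.rank() == 0)
      std::cout << "largest rank is " << largest << std::endl;
    return 0;
  }
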
MPI defines several special communicators: `MPI_COMM_WORLD` (containing
all processes that the local process can communicate with),
`MPI_COMM_SELF` (containing only the local process), and
`MPI_COMM_EMPTY` (containing no processes). These special communicators
are all instances of the [classref boost::mpi::communicator
`communicator`] class in Boost.MPI.

[table Predefined communicators
  [[C Constant] [Boost.MPI Equivalent]]

  [[`MPI_COMM_WORLD`] [a default-constructed [classref boost::mpi::communicator `communicator`]]]
  [[`MPI_COMM_SELF`] [a [classref boost::mpi::communicator `communicator`] that contains only the current process]]
  [[`MPI_COMM_EMPTY`] [a [classref boost::mpi::communicator `communicator`] that evaluates to false]]
]

Boost.MPI supports groups of processes through its [classref
boost::mpi::group `group`] class.

[table Group operations and constants
  [[C Function/Constant] [Boost.MPI Equivalent]]

  [[`MPI_GROUP_EMPTY`] [a default-constructed [classref
  boost::mpi::group `group`]]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node97.html#Node97
  `MPI_Group_size`]] [[memberref boost::mpi::group::size `group::size`]]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node97.html#Node97
  `MPI_Group_rank`]] [[memberref boost::mpi::group::rank `group::rank`]]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node97.html#Node97
  `MPI_Group_translate_ranks`]] [[memberref boost::mpi::group::translate_ranks `group::translate_ranks`]]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node97.html#Node97
  `MPI_Group_compare`]] [operators `==` and `!=`]]
  [[`MPI_IDENT`] [operators `==` and `!=`]]
  [[`MPI_SIMILAR`] [operators `==` and `!=`]]
  [[`MPI_UNEQUAL`] [operators `==` and `!=`]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node98.html#Node98
  `MPI_Comm_group`]] [[memberref
  boost::mpi::communicator::group `communicator::group`]]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node98.html#Node98
  `MPI_Group_union`]] [operator `|` for groups]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node98.html#Node98
  `MPI_Group_intersection`]] [operator `&` for groups]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node98.html#Node98
  `MPI_Group_difference`]] [operator `-` for groups]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node98.html#Node98
  `MPI_Group_incl`]] [[memberref boost::mpi::group::include `group::include`]]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node98.html#Node98
  `MPI_Group_excl`]] [[memberref boost::mpi::group::exclude `group::exclude`]]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node98.html#Node98
  `MPI_Group_range_incl`]] [unsupported]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node98.html#Node98
  `MPI_Group_range_excl`]] [unsupported]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node99.html#Node99
  `MPI_Group_free`]] [used automatically in Boost.MPI]]
]

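The set-like group operations compose naturally. A minimal sketch (the
particular split into rank 0 and the remaining ranks is an arbitrary
illustration):

  #include <boost/mpi.hpp>
  namespace mpi = boost::mpi;

  int main()
  {
    mpi::environment env;
    mpi::communicator world;

    mpi::group all = world.group();       // MPI_Comm_group
    int ranks[] = { 0 };
    mpi::group root = all.include(ranks, ranks + 1);  // MPI_Group_incl
    mpi::group others = all - root;       // MPI_Group_difference
    mpi::group rejoined = root | others;  // MPI_Group_union
    bool same = (rejoined == all);        // MPI_Group_compare
    return same ? 0 : 1;
  }
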
Boost.MPI provides manipulation of communicators through the [classref
boost::mpi::communicator `communicator`] class.

[table Communicator operations
  [[C Function] [Boost.MPI Equivalent]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node101.html#Node101
  `MPI_Comm_size`]] [[memberref boost::mpi::communicator::size `communicator::size`]]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node101.html#Node101
  `MPI_Comm_rank`]] [[memberref boost::mpi::communicator::rank
  `communicator::rank`]]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node101.html#Node101
  `MPI_Comm_compare`]] [operators `==` and `!=`]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node102.html#Node102
  `MPI_Comm_dup`]] [[classref boost::mpi::communicator `communicator`]
  class constructor using `comm_duplicate`]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node102.html#Node102
  `MPI_Comm_create`]] [[classref boost::mpi::communicator
  `communicator`] constructor]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node102.html#Node102
  `MPI_Comm_split`]] [[memberref boost::mpi::communicator::split
  `communicator::split`]]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node103.html#Node103
  `MPI_Comm_free`]] [used automatically in Boost.MPI]]
]

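For example, [memberref boost::mpi::communicator::split
`communicator::split`] partitions a communicator by color, invoking
`MPI_Comm_split` underneath. A minimal sketch, splitting the world
into even and odd ranks:

  #include <boost/mpi.hpp>
  #include <iostream>
  namespace mpi = boost::mpi;

  int main()
  {
    mpi::environment env;
    mpi::communicator world;

    // Processes with the same color end up in the same subcommunicator.
    int color = world.rank() % 2;
    mpi::communicator half = world.split(color);
    std::cout << "world rank " << world.rank() << " has rank "
              << half.rank() << " in its half." << std::endl;
    return 0;
  }
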
Boost.MPI currently provides support for inter-communicators via the
[classref boost::mpi::intercommunicator `intercommunicator`] class.

[table Inter-communicator operations
  [[C Function] [Boost.MPI Equivalent]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node112.html#Node112
  `MPI_Comm_test_inter`]] [use [memberref boost::mpi::communicator::as_intercommunicator `communicator::as_intercommunicator`]]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node112.html#Node112
  `MPI_Comm_remote_size`]] [[memberref boost::mpi::intercommunicator::remote_size `intercommunicator::remote_size`]]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node112.html#Node112
  `MPI_Comm_remote_group`]] [[memberref boost::mpi::intercommunicator::remote_group `intercommunicator::remote_group`]]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node113.html#Node113
  `MPI_Intercomm_create`]] [[classref boost::mpi::intercommunicator `intercommunicator`] constructor]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node113.html#Node113
  `MPI_Intercomm_merge`]] [[memberref boost::mpi::intercommunicator::merge `intercommunicator::merge`]]]
]

Boost.MPI currently provides no support for attribute caching.

[table Attributes and caching
  [[C Function/Constant] [Boost.MPI Equivalent]]

  [[`MPI_NULL_COPY_FN`] [unsupported]]
  [[`MPI_NULL_DELETE_FN`] [unsupported]]
  [[`MPI_KEYVAL_INVALID`] [unsupported]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node119.html#Node119
  `MPI_Keyval_create`]] [unsupported]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node119.html#Node119
  `MPI_Copy_function`]] [unsupported]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node119.html#Node119
  `MPI_Delete_function`]] [unsupported]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node119.html#Node119
  `MPI_Keyval_free`]] [unsupported]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node119.html#Node119
  `MPI_Attr_put`]] [unsupported]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node119.html#Node119
  `MPI_Attr_get`]] [unsupported]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node119.html#Node119
  `MPI_Attr_delete`]] [unsupported]]
]

Boost.MPI supports the creation of communicators with attached
topologies and the later querying of those topologies. Support
for graph topologies is provided via an interface to the
[@http://www.boost.org/libs/graph/doc/index.html Boost Graph Library
(BGL)], where a communicator can be created which matches the
structure of any BGL graph, and the graph topology of a communicator
can be viewed as a BGL graph for use in existing, generic graph
algorithms. As the table below shows, the Cartesian topology
operations are currently unsupported.

[table Process topologies
  [[C Function/Constant] [Boost.MPI Equivalent]]

  [[`MPI_GRAPH`] [unnecessary; use [memberref boost::mpi::communicator::as_graph_communicator `communicator::as_graph_communicator`]]]
  [[`MPI_CART`] [unnecessary; use [memberref boost::mpi::communicator::has_cartesian_topology `communicator::has_cartesian_topology`]]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node133.html#Node133
  `MPI_Cart_create`]] [unsupported]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node134.html#Node134
  `MPI_Dims_create`]] [unsupported]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node135.html#Node135
  `MPI_Graph_create`]] [[classref
  boost::mpi::graph_communicator
  `graph_communicator`] constructors]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node136.html#Node136
  `MPI_Topo_test`]] [[memberref
  boost::mpi::communicator::as_graph_communicator
  `communicator::as_graph_communicator`], [memberref
  boost::mpi::communicator::has_cartesian_topology
  `communicator::has_cartesian_topology`]]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node136.html#Node136
  `MPI_Graphdims_get`]] [[funcref boost::mpi::num_vertices
  `num_vertices`], [funcref boost::mpi::num_edges `num_edges`]]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node136.html#Node136
  `MPI_Graph_get`]] [[funcref boost::mpi::vertices
  `vertices`], [funcref boost::mpi::edges `edges`]]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node136.html#Node136
  `MPI_Cartdim_get`]] [unsupported]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node136.html#Node136
  `MPI_Cart_get`]] [unsupported]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node136.html#Node136
  `MPI_Cart_rank`]] [unsupported]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node136.html#Node136
  `MPI_Cart_coords`]] [unsupported]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node136.html#Node136
  `MPI_Graph_neighbors_count`]] [[funcref boost::mpi::out_degree
  `out_degree`]]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node136.html#Node136
  `MPI_Graph_neighbors`]] [[funcref boost::mpi::out_edges
  `out_edges`], [funcref boost::mpi::adjacent_vertices `adjacent_vertices`]]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node137.html#Node137
  `MPI_Cart_shift`]] [unsupported]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node138.html#Node138
  `MPI_Cart_sub`]] [unsupported]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node139.html#Node139
  `MPI_Cart_map`]] [unsupported]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node139.html#Node139
  `MPI_Graph_map`]] [unsupported]]
]

Boost.MPI supports environmental inquiries through the [classref
boost::mpi::environment `environment`] class.

[table Environmental inquiries
  [[C Function/Constant] [Boost.MPI Equivalent]]

  [[`MPI_TAG_UB`] [unnecessary; use [memberref
  boost::mpi::environment::max_tag `environment::max_tag`]]]
  [[`MPI_HOST`] [unnecessary; use [memberref
  boost::mpi::environment::host_rank `environment::host_rank`]]]
  [[`MPI_IO`] [unnecessary; use [memberref
  boost::mpi::environment::io_rank `environment::io_rank`]]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node143.html#Node147
  `MPI_Get_processor_name`]]
  [[memberref boost::mpi::environment::processor_name
  `environment::processor_name`]]]
]

Boost.MPI translates MPI errors into exceptions, reported via the
[classref boost::mpi::exception `exception`] class.

[table Error handling
  [[C Function/Constant] [Boost.MPI Equivalent]]

  [[`MPI_ERRORS_ARE_FATAL`] [unused; errors are translated into
  Boost.MPI exceptions]]
  [[`MPI_ERRORS_RETURN`] [unused; errors are translated into
  Boost.MPI exceptions]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node148.html#Node148
  `MPI_errhandler_create`]] [unused; errors are translated into
  Boost.MPI exceptions]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node148.html#Node148
  `MPI_errhandler_set`]] [unused; errors are translated into
  Boost.MPI exceptions]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node148.html#Node148
  `MPI_errhandler_get`]] [unused; errors are translated into
  Boost.MPI exceptions]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node148.html#Node148
  `MPI_errhandler_free`]] [unused; errors are translated into
  Boost.MPI exceptions]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node148.html#Node148
  `MPI_Error_string`]] [used internally by Boost.MPI]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node149.html#Node149
  `MPI_Error_class`]] [[memberref boost::mpi::exception::error_class `exception::error_class`]]]
]

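For example, an MPI failure surfaces as a thrown [classref
boost::mpi::exception `exception`] rather than an error code. The
sketch below deliberately receives from an out-of-range rank to
trigger one; the exact error reported may vary by MPI implementation:

  #include <boost/mpi.hpp>
  #include <iostream>
  #include <string>
  namespace mpi = boost::mpi;

  int main()
  {
    mpi::environment env;
    mpi::communicator world;
    try {
      std::string msg;
      world.recv(world.size(), 0, msg);  // invalid source rank
    } catch (mpi::exception& e) {
      // error_class() corresponds to MPI_Error_class.
      std::cerr << e.what() << " (error class "
                << e.error_class() << ")" << std::endl;
    }
    return 0;
  }
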
The MPI timing facilities are exposed via the Boost.MPI [classref
boost::mpi::timer `timer`] class, which provides an interface
compatible with the [@http://www.boost.org/libs/timer/index.html Boost
Timer library].

[table Timing facilities
  [[C Function/Constant] [Boost.MPI Equivalent]]

  [[`MPI_WTIME_IS_GLOBAL`] [unnecessary; use [memberref
  boost::mpi::timer::time_is_global `timer::time_is_global`]]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node150.html#Node150
  `MPI_Wtime`]] [use [memberref boost::mpi::timer::elapsed
  `timer::elapsed`] to determine the time elapsed from some specific
  starting point]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node150.html#Node150
  `MPI_Wtick`]] [[memberref boost::mpi::timer::elapsed_min `timer::elapsed_min`]]]
]

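A minimal sketch of its use, timing a barrier (what you time is up to
you; the barrier here is just an example):

  #include <boost/mpi.hpp>
  #include <boost/mpi/timer.hpp>
  #include <iostream>
  namespace mpi = boost::mpi;

  int main()
  {
    mpi::environment env;
    mpi::communicator world;

    mpi::timer t;          // starts timing (MPI_Wtime underneath)
    world.barrier();
    std::cout << "barrier took " << t.elapsed()
              << "s; clock resolution is " << t.elapsed_min()
              << "s" << std::endl;
    return 0;
  }
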
MPI startup and shutdown are managed by the construction and
destruction of the Boost.MPI [classref boost::mpi::environment
`environment`] class.

[table Startup/shutdown facilities
  [[C Function] [Boost.MPI Equivalent]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node151.html#Node151
  `MPI_Init`]] [[classref boost::mpi::environment `environment`]
  constructor]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node151.html#Node151
  `MPI_Finalize`]] [[classref boost::mpi::environment `environment`]
  destructor]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node151.html#Node151
  `MPI_Initialized`]] [[memberref boost::mpi::environment::initialized
  `environment::initialized`]]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node151.html#Node151
  `MPI_Abort`]] [[memberref boost::mpi::environment::abort
  `environment::abort`]]]
]

Boost.MPI does not provide any support for the profiling facilities in
MPI 1.1.

[table Profiling interface
  [[C Function] [Boost.MPI Equivalent]]

  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node153.html#Node153
  `PMPI_*` routines]] [unsupported]]
  [[[@http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node156.html#Node156
  `MPI_Pcontrol`]] [unsupported]]
]

[endsect]

[endsect]

[xinclude mpi_autodoc.xml]

[section:python Python Bindings]
[python]

Boost.MPI provides an alternative MPI interface usable from the
_Python_ programming language via the `boost.mpi` module. The
Boost.MPI Python bindings, built on top of the C++ Boost.MPI using the
_BoostPython_ library, provide nearly all of the functionality of
Boost.MPI within a dynamic, object-oriented language.

The Boost.MPI Python module can be built and installed from the
`libs/mpi/build` directory. Just follow the [link
mpi.config configuration] and [link mpi.installation
installation] instructions for the C++ Boost.MPI. Once you have
installed the Python module, be sure that the installation location is
in your `PYTHONPATH`.

[section:python_quickstart Quickstart]

[python]

Getting started with the Boost.MPI Python module is as easy as
importing `boost.mpi`. Our first "Hello, World!" program is
just two lines long:

  import boost.mpi as mpi
  print "I am process %d of %d." % (mpi.rank, mpi.size)

Go ahead and run this program with several processes. Be sure to
invoke the `python` interpreter from `mpirun`, e.g.,

[pre
mpirun -np 5 python hello_world.py
]

This will produce output such as:

[pre
I am process 1 of 5.
I am process 3 of 5.
I am process 2 of 5.
I am process 4 of 5.
I am process 0 of 5.
]

Point-to-point operations in Boost.MPI have nearly the same syntax in
Python as in C++. We can write a simple two-process Python program
that prints "Hello, world!" by transmitting Python strings:

  import boost.mpi as mpi

  if mpi.world.rank == 0:
    mpi.world.send(1, 0, 'Hello')
    msg = mpi.world.recv(1, 1)
    print msg, '!'
  else:
    msg = mpi.world.recv(0, 0)
    print (msg + ', '),
    mpi.world.send(0, 1, 'world')

There are only a few notable differences between this Python code and
the example [link mpi.point_to_point in the C++
tutorial]. First of all, we don't need to write any initialization
code in Python: just loading the `boost.mpi` module makes the
appropriate `MPI_Init` and `MPI_Finalize` calls. Second, we're passing
Python objects from one process to another through MPI. Any Python
object that can be pickled can be transmitted; the next section will
describe in more detail how the Boost.MPI Python layer transmits
objects. Finally, when we receive objects with `recv`, we don't need
to specify the type because transmission of Python objects is
polymorphic.

When experimenting with Boost.MPI in Python, don't forget that help is
always available via `pydoc`: just pass the name of the module or
module entity on the command line (e.g., `pydoc
boost.mpi.communicator`) to receive complete reference
documentation. When in doubt, try it!

[endsect]

[section:python_user_data Transmitting User-Defined Data]

Boost.MPI can transmit user-defined data in several different ways.
Most importantly, it can transmit arbitrary _Python_ objects by pickling
them at the sender and unpickling them at the receiver, allowing
arbitrarily complex Python data structures to interoperate with MPI.

Boost.MPI also supports efficient serialization and transmission of
C++ objects (that have been exposed to Python) through its C++
interface. Any C++ type that provides (de-)serialization routines that
meet the requirements of the Boost.Serialization library is eligible
for this optimization, but the type must be registered in advance. To
register a C++ type, invoke the C++ function [funcref
boost::mpi::python::register_serialized
register_serialized]. If your C++ types come from other Python modules
(they probably will!), those modules will need to link against the
`boost_mpi` and `boost_mpi_python` libraries as described in the [link
mpi.installation installation section]. Note that you do
*not* need to link against the Boost.MPI Python extension module.

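For instance, a registration sketch for a hypothetical serializable
`point` type exposed via Boost.Python; the module and member names are
illustrative only:

[c++]

  #include <boost/mpi/python.hpp>
  #include <boost/python.hpp>

  struct point
  {
    double x, y;
    template<typename Archive>
    void serialize(Archive& ar, const unsigned int /*version*/)
    { ar & x & y; }
  };

  BOOST_PYTHON_MODULE(geometry)
  {
    using namespace boost::python;
    class_<point>("point")
      .def_readwrite("x", &point::x)
      .def_readwrite("y", &point::y);

    // Transmit point via its C++ serialization rather than pickling.
    boost::mpi::python::register_serialized(point());
  }

[python]
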
Finally, Boost.MPI supports separation of the structure of an object
from the data it stores, allowing the two pieces to be transmitted
separately. This "skeleton/content" mechanism, described in more
detail in a later section, is a communication optimization suitable
for problems with fixed data structures whose internal data changes
frequently.

[endsect]

[section:python_collectives Collectives]

Boost.MPI supports all of the MPI collectives (`scatter`, `reduce`,
`scan`, `broadcast`, etc.) for any type of data that can be
transmitted with the point-to-point communication operations. For the
MPI collectives that require a user-specified operation (e.g., `reduce`
and `scan`), the operation can be an arbitrary Python function. For
instance, one could concatenate strings with `all_reduce`:

  mpi.all_reduce(my_string, lambda x,y: x + y)

The following module-level functions implement MPI collectives:

[table Python collective functions
  [[Function] [Description]]

  [[`all_gather`] [Gather the values from all processes.]]
  [[`all_reduce`] [Combine the results from all processes.]]
  [[`all_to_all`] [Every process sends data to every other process.]]
  [[`broadcast`] [Broadcast data from one process to all other processes.]]
  [[`gather`] [Gather the values from all processes to the root.]]
  [[`reduce`] [Combine the results from all processes to the root.]]
  [[`scan`] [Prefix reduction of the values from all processes.]]
  [[`scatter`] [Scatter the values stored at the root to all processes.]]
]

[endsect]

[section:python_skeleton_content Skeleton/Content Mechanism]

Boost.MPI provides a skeleton/content mechanism that allows the
transfer of large data structures to be split into two separate stages,
with the skeleton (or, "shape") of the data structure sent first and
the content (or, "data") of the data structure sent later, potentially
several times, so long as the structure has not changed since the
skeleton was transferred. The skeleton/content mechanism can improve
performance when the data structure is large and its shape is fixed,
because while the skeleton requires serialization (it has an unknown
size), the content transfer is fixed-size and can be done without
extra copies.

To use the skeleton/content mechanism from Python, you must first
register the type of your data structure with the skeleton/content
mechanism *from C++*. The registration function is [funcref
boost::mpi::python::register_skeleton_and_content
register_skeleton_and_content] and resides in the [headerref
boost/mpi/python.hpp <boost/mpi/python.hpp>] header.

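A registration sketch, assuming `std::vector<int>` has already been
exposed to Python elsewhere (e.g., via Boost.Python's indexing
facilities) and that the call follows the same default-argument
pattern as `register_serialized`; the module name is illustrative:

[c++]

  #include <boost/mpi/python.hpp>
  #include <boost/python.hpp>
  #include <vector>

  BOOST_PYTHON_MODULE(my_structures)
  {
    // Register std::vector<int> with the skeleton/content mechanism.
    boost::mpi::python::register_skeleton_and_content(std::vector<int>());
  }

[python]
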
Once you have registered your C++ data structures, you can extract
the skeleton for an instance of that data structure with `skeleton()`.
The resulting `skeleton_proxy` can be transmitted via the normal send
routine, e.g.,

  mpi.world.send(1, 0, skeleton(my_data_structure))

`skeleton_proxy` objects can be received on the other end via `recv()`,
which stores a newly-created instance of your data structure with the
same "shape" as the sender in its `"object"` attribute:

  shape = mpi.world.recv(0, 0)
  my_data_structure = shape.object

Once the skeleton has been transmitted, the content (accessed via
`get_content`) can be transmitted in much the same way. Note, however,
that the receiver also specifies `get_content(my_data_structure)` in its
call to receive:

  if mpi.rank == 0:
    mpi.world.send(1, 0, get_content(my_data_structure))
  else:
    mpi.world.recv(0, 0, get_content(my_data_structure))

Of course, this transmission of content can occur repeatedly, if the
values in the data structure--but not its shape--change.

The skeleton/content mechanism is a structured way to exploit the
interaction between custom-built MPI datatypes and `MPI_BOTTOM`, to
eliminate extra buffer copies.

[endsect]

[section:python_compatibility C++/Python MPI Compatibility]

Boost.MPI is a C++ library whose facilities have been exposed to Python
via the Boost.Python library. Since the Boost.MPI Python bindings are
built directly on top of the C++ library, and nearly every feature of
the C++ library is available in Python, hybrid C++/Python programs using
Boost.MPI can interact, e.g., sending a value from Python but receiving
that value in C++ (or vice versa). However, doing so requires some
care. Because Python objects are dynamically typed, Boost.MPI transfers
type information along with the serialized form of the object, so that
the object can be received even when its type is not known. This
mechanism differs from its C++ counterpart, where the static types of
transmitted values are always known.

The only way to communicate between the C++ and Python views on
Boost.MPI is to traffic entirely in Python objects. For Python, this
is the normal state of affairs, so nothing will change. For C++, this
means sending and receiving values of type `boost::python::object`,
from the _BoostPython_ library. For instance, say we want to transmit
an integer value from Python:

  comm.send(1, 0, 17)

In C++, we would receive that value into a Python object and then
`extract` an integer value:

[c++]

  boost::python::object value;
  comm.recv(0, 0, value);
  int int_value = boost::python::extract<int>(value);

In the future, Boost.MPI will be extended to allow improved
interoperability with the C++ Boost.MPI and the C MPI bindings.

[endsect]

[section:pythonref Reference]

The Boost.MPI Python module, `boost.mpi`, has its own
[@boost.mpi.html reference documentation], which is also
available using `pydoc` (from the command line) or
`help(boost.mpi)` (from the Python interpreter).

[endsect]

[endsect]

[section:design Design Philosophy]

The design philosophy of the Parallel MPI library is very simple: be
both convenient and efficient. MPI is a library built for
high-performance applications, but its FORTRAN-centric,
performance-minded design makes it rather inflexible from the C++
point of view: passing a string from one process to another is
inconvenient, requiring several messages and explicit buffering;
passing a container of strings from one process to another requires
an extra level of manual bookkeeping; and passing a map from strings
to containers of strings is positively infuriating. The Parallel MPI
library allows all of these data types to be passed using the same
simple `send()` and `recv()` primitives. Likewise, collective
operations such as [funcref boost::mpi::reduce `reduce()`]
allow arbitrary data types and function objects, much like the C++
Standard Library would.

The higher-level abstractions provided for convenience must not have
an impact on the performance of the application. For instance, sending
an integer via `send` must be as efficient as a call to `MPI_Send`,
which means that it must be implemented by a simple call to
`MPI_Send`; likewise, an integer [funcref boost::mpi::reduce
`reduce()`] using `std::plus<int>` must be implemented with a call to
`MPI_Reduce` on integers using the `MPI_SUM` operation: anything less
will impact performance. In essence, this is the "don't pay for what
you don't use" principle: if the user is not transmitting strings,
s/he should not pay the overhead associated with strings.

Sometimes, achieving maximal performance means foregoing convenient
abstractions and implementing certain functionality using lower-level
primitives. For this reason, it is always possible to extract enough
information from the abstractions in Boost.MPI to minimize
the amount of effort required to interface between Boost.MPI
and the C MPI library.

[endsect]

[section:threading Threads]

There are an increasing number of hybrid parallel applications that mix
distributed and shared memory parallelism. To support that model, one
needs to know what level of threading support is guaranteed by the MPI
implementation. There are four ordered levels of possible threading
support, described by [enumref boost::mpi::threading::level
mpi::threading::level]. At the lowest level, you should not use threads
at all; at the highest level, any thread can perform MPI calls.

If you want to use multi-threading in your MPI application, you should
indicate your preferred threading support in the environment
constructor. Then probe the level the library actually provides and
decide what you can do with it (it could be nothing, in which case
aborting is a valid option):

  #include <boost/mpi/environment.hpp>
  #include <boost/mpi/communicator.hpp>
  #include <iostream>
  namespace mpi = boost::mpi;
  namespace mt = mpi::threading;

  int main()
  {
    mpi::environment env(mt::funneled);
    if (env.thread_level() < mt::funneled) {
      env.abort(-1);
    }
    mpi::communicator world;
    std::cout << "I am process " << world.rank() << " of " << world.size()
              << "." << std::endl;
    return 0;
  }

[endsect]

[section:performance Performance Evaluation]

Message-passing performance is crucial in high-performance distributed
computing. To evaluate the performance of Boost.MPI, we modified the
standard [@http://www.scl.ameslab.gov/netpipe/ NetPIPE] benchmark
(version 3.6.2) to use Boost.MPI and compared its performance against
raw MPI. We ran five different variants of the NetPIPE benchmark:

# MPI: The unmodified NetPIPE benchmark.

# Boost.MPI: NetPIPE modified to use Boost.MPI calls for
communication.

# MPI (Datatypes): NetPIPE modified to use a derived datatype (which
itself contains a single `MPI_BYTE`) rather than a fundamental
datatype.

# Boost.MPI (Datatypes): NetPIPE modified to use a user-defined type
`Char` in place of the fundamental `char` type. The `Char` type
contains a single `char`, a `serialize()` method to make it
serializable, and specializes [classref
boost::mpi::is_mpi_datatype is_mpi_datatype] to force
Boost.MPI to build a derived MPI data type for it.

# Boost.MPI (Serialized): NetPIPE modified to use a user-defined type
`Char` in place of the fundamental `char` type. This `Char` type
contains a single `char` and is serializable. Unlike the Datatypes
case, [classref boost::mpi::is_mpi_datatype
is_mpi_datatype] is *not* specialized, forcing Boost.MPI to perform
many, many serialization calls.

The actual tests were performed on the Odin cluster in the
[@http://www.cs.indiana.edu/ Department of Computer Science] at
[@http://www.iub.edu Indiana University], which contains 128 nodes
connected via Infiniband. Each node contains 4GB memory and two AMD
Opteron processors. The NetPIPE benchmarks were compiled with Intel's
C++ Compiler, version 9.0, Boost 1.35.0 (prerelease), and
[@http://www.open-mpi.org/ Open MPI] version 1.1. The NetPIPE results
follow:

[$../../libs/mpi/doc/netpipe.png]

There are some observations we can make about these NetPIPE
results. First of all, the top two plots show that Boost.MPI performs
on par with MPI for fundamental types. The next two plots show that
Boost.MPI performs on par with MPI for derived data types, even though
Boost.MPI provides a much more abstract, completely transparent
approach to building derived data types than raw MPI. Overall
performance for derived data types is significantly worse than for
fundamental data types, but the bottleneck is in the underlying MPI
implementation itself. Finally, when forcing Boost.MPI to serialize
characters individually, performance suffers greatly. This particular
instance is the worst possible case for Boost.MPI, because we are
serializing millions of individual characters. Overall, the
additional abstraction provided by Boost.MPI does not impair its
performance.

[endsect]

[section:history Revision History]

* *Boost 1.36.0*:
  * Support for non-blocking operations in Python, from Andreas Klöckner

* *Boost 1.35.0*: Initial release, containing the following post-review changes
  * Support for arrays in all collective operations
  * Support default-construction of [classref boost::mpi::environment environment]

* *2006-09-21*: Boost.MPI accepted into Boost.

[endsect]

[section:acknowledge Acknowledgments]

Boost.MPI was developed with support from Zurcher Kantonalbank. Daniel
Egloff and Michael Gauckler contributed many ideas to Boost.MPI's
design, particularly in the design of its abstractions for
MPI data types and the novel skeleton/content mechanism for large data
structures. Prabhanjan (Anju) Kambadur developed the predecessor to
Boost.MPI that proved the usefulness of the Serialization library in
an MPI setting and the performance benefits of specialization in a C++
abstraction layer for MPI. Jeremy Siek managed the formal review of Boost.MPI.

[endsect]

[endsect]
|