cl-mpi provides convenient CFFI bindings for the Message Passing Interface (MPI). MPI is typically used in High Performance Computing to utilize big parallel computers with thousands of cores. It features minimal communication overhead with a latency in the range of microseconds. In comparison to the C or FORTRAN interface of MPI, cl-mpi relieves the programmer from working with raw pointers to memory and a plethora of mandatory function arguments.
If you have questions or suggestions, feel free to contact me (email@example.com).
cl-mpi has been tested with MPICH, MPICH2, IntelMPI and Open MPI.
An MPI program must be launched with mpiexec. This command spawns multiple processes, depending on your system and command-line parameters. Each process is identical, except that it has a unique rank that can be queried with (MPI-COMM-RANK). The ranks are assigned from 0 to (- (MPI-COMM-SIZE) 1). A wide range of communication functions is available to transmit messages between different ranks. To become familiar with cl-mpi, see the examples directory.
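As a minimal sketch of the rank and size queries described above, the following program prints one line per process. It assumes that cl-mpi is installed where ASDF can find it, and that MPI-INIT and MPI-FINALIZE are exported from the cl-mpi package alongside MPI-COMM-RANK and MPI-COMM-SIZE. The package name hello-ranks and the function name main are hypothetical and chosen only for illustration.

```lisp
;; Assumption: ASDF is available and can locate the cl-mpi system.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (require :asdf)
  (asdf:load-system :cl-mpi))

(defpackage :hello-ranks              ; hypothetical package name
  (:use :cl)
  (:export #:main))
(in-package :hello-ranks)

(defun main ()
  "Print this process's rank and the total number of processes."
  ;; Assumption: MPI-INIT and MPI-FINALIZE are exported by cl-mpi.
  (cl-mpi:mpi-init)
  (let ((rank (cl-mpi:mpi-comm-rank))   ; unique per process, starting at 0
        (size (cl-mpi:mpi-comm-size)))  ; number of processes in total
    (format t "Hello from rank ~D of ~D~%" rank size))
  (cl-mpi:mpi-finalize))
```

When main is called in every process of a job launched with mpiexec, each process should report a different rank between 0 and one less than the process count, in no guaranteed order.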
The easiest way to deploy and run cl-mpi applications is by creating a statically linked binary. To do so, create a separate ASDF system like this:
(defsystem :my-mpi-app
  :depends-on (:cl-mpi)
  :defsystem-depends-on (:cl-mpi-asdf-integration)
  :class :mpi-program
  :build-operation :static-program-op
  :build-pathname "my-mpi-app"
  :entry-point "my-mpi-app:main"
  :serial t
  :components
  ((:file "foo")
   (:file "bar")))
and simply run (asdf:make :my-mpi-app) on the REPL. Note that not all Lisp implementations support the creation of statically linked binaries (so far, only SBCL has been tested). Alternatively, you can try to use uiop:dump-image to create binaries.
Further remark: If the creation of statically linked binaries with SBCL fails with something like "undefined reference to main", your SBCL is probably not built with the :sb-linkable-runtime feature. You are affected by this when (find :sb-linkable-runtime *features*) returns NIL. In that case, you have to compile SBCL yourself, which is as simple as executing the following commands (where SOMEWHERE is the desired installation folder):
git clone git://git.code.sf.net/p/sbcl/sbcl
cd sbcl
sh make.sh --prefix=SOMEWHERE --fancy --with-sb-linkable-runtime --with-sb-dynamic-core
(cd tests && sh run-tests.sh)
sh install.sh
To run the test suite:
cl-mpi makes no additional copies of transmitted data and therefore achieves the same bandwidth as MPI programs written in other languages (C, FORTRAN). However, the convenience of error handling, automatic inference of message types, and safe computation of memory locations adds a little overhead to each message. The exact overhead varies depending on the Lisp implementation and platform, but is somewhere around 1000 machine cycles.
Summary:
- latency increase per message: 400 nanoseconds (SBCL on a 2.4GHz Intel i7-5500U)
- bandwidth unchanged
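The two overhead figures can be cross-checked with a short calculation. The sketch below converts a cycle count to nanoseconds at a given clock frequency; the helper name cycles->nanoseconds is hypothetical, and the calculation assumes the processor runs at its nominal 2.4 GHz clock.

```lisp
(defun cycles->nanoseconds (cycles clock-hz)
  "Convert a number of machine cycles to nanoseconds at CLOCK-HZ."
  ;; cycles / clock-hz gives seconds; scale by 1e9 for nanoseconds.
  (* 1d9 (/ cycles clock-hz)))

;; About 1000 cycles at 2.4 GHz, the figures quoted above.
(cycles->nanoseconds 1000 2.4d9)
```

The result is a little under 417 nanoseconds, which agrees with the measured latency increase of roughly 400 nanoseconds per message.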
- Alex Fukunaga
- Marco Heisig
This project was funded by KONWIHR (The Bavarian Competence Network for Technical and Scientific High Performance Computing) and the Chair for Applied Mathematics 3 of Prof. Dr. Bänsch at the FAU Erlangen-Nürnberg.
Big thanks to Nicolas Neuss for all the useful suggestions.
- Marco Heisig <firstname.lastname@example.org>