MPI
Install Open MPI on Debian/Ubuntu:
sudo apt install openmpi-bin libopenmpi-dev
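A quick sanity check that the compiler wrapper and launcher are on PATH (version output will vary with the packaged release):
$ mpicc --version
$ mpirun --version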
References:
http://www.xtaohub.com/
https://blog.csdn.net/qq_22370527/article/details/109129567
https://blog.csdn.net/weixin_40729260/article/details/125435633
https://www.cnblogs.com/aobaxu/p/16195237.html
List the PMI plugin types the local Slurm was built with:
srun --mpi=list
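On a default Slurm build without PMIx, the output typically looks like this (sample output; the exact list depends on how Slurm was configured):
srun: MPI types are...
srun: none
srun: openmpi
srun: pmi2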
The following workaround is from https://bugs.schedmd.com/show_bug.cgi?id=7236. A default Slurm install ships only libslurm:
alex@polaris:~/slurm/19.05/install/lib$ ls -l
total 70992
-rw-r--r-- 1 alex alex 62102336 Jun 13 16:36 libslurm.a
-rwxr-xr-x 1 alex alex 987 Jun 13 16:36 libslurm.la
lrwxrwxrwx 1 alex alex 18 Jun 13 16:36 libslurm.so -> libslurm.so.34.0.0
lrwxrwxrwx 1 alex alex 18 Jun 13 16:36 libslurm.so.34 -> libslurm.so.34.0.0
-rwxr-xr-x 1 alex alex 10562200 Jun 13 16:36 libslurm.so.34.0.0
drwxr-xr-x 3 alex alex 20480 Jun 13 16:38 slurm
alex@polaris:~/slurm/19.05/install/lib$
(there is no libpmi or libpmi2)
You can manually install the libpmi or libpmi2 that ships with Slurm by building contribs/pmi or contribs/pmi2, respectively. Here's an example of installing libpmi2:
alex@polaris:~/slurm/19.05/build/contribs/pmi2$ make -j install
alex@polaris:~/slurm/19.05/install/lib$ ls -l
total 71688
-rw-r--r-- 1 alex alex 490536 Jun 13 18:17 libpmi2.a
-rwxr-xr-x 1 alex alex 961 Jun 13 18:17 libpmi2.la
lrwxrwxrwx 1 alex alex 16 Jun 13 18:17 libpmi2.so -> libpmi2.so.0.0.0
lrwxrwxrwx 1 alex alex 16 Jun 13 18:17 libpmi2.so.0 -> libpmi2.so.0.0.0
-rwxr-xr-x 1 alex alex 214400 Jun 13 18:17 libpmi2.so.0.0.0
-rw-r--r-- 1 alex alex 62102336 Jun 13 16:36 libslurm.a
-rwxr-xr-x 1 alex alex 987 Jun 13 16:36 libslurm.la
lrwxrwxrwx 1 alex alex 18 Jun 13 16:36 libslurm.so -> libslurm.so.34.0.0
lrwxrwxrwx 1 alex alex 18 Jun 13 16:36 libslurm.so.34 -> libslurm.so.34.0.0
-rwxr-xr-x 1 alex alex 10562200 Jun 13 16:36 libslurm.so.34.0.0
drwxr-xr-x 3 alex alex 20480 Jun 13 16:38 slurm
alex@polaris:~/slurm/19.05/install/lib$
If Open MPI was built without Slurm PMI support, direct launch with srun fails like this:
[wangyang-PC:331801] OPAL ERROR: Unreachable in file ext3x_client.c at line 112
--------------------------------------------------------------------------
The application appears to have been direct launched using "srun",
but OMPI was not built with SLURM's PMI support and therefore cannot
execute. There are several options for building PMI support under
SLURM, depending upon the SLURM version you are using:
version 16.05 or later: you can use SLURM's PMIx support. This
requires that you configure and build SLURM --with-pmix.
Versions earlier than 16.05: you must use either SLURM's PMI-1 or
PMI-2 support. SLURM builds PMI-1 by default, or you can manually
install PMI-2. You must then build Open MPI using --with-pmi pointing
to the SLURM PMI library location.
Please configure as appropriate and try again.
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[wangyang-PC:331801] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
srun: error: wangyang-PC: task 0: Exited with exit code 1
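Per the help text above, the fix is to rebuild Open MPI with --with-pmi pointing at the Slurm installation that provides libpmi2. A minimal sketch, assuming the install prefix from the listings above (paths are illustrative):
$ cd openmpi-<version>
$ ./configure --prefix=$HOME/openmpi --with-pmi=$HOME/slurm/19.05/install
$ make -j install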
$ cat > hello.cpp <<'EOF'
#include <mpi.h>
#include <iostream>

int main(int argc, char* argv[])
{
    int rank;
    int size;
    MPI_Init(&argc, &argv);                  // initialize the MPI environment
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);    // rank of this process
    MPI_Comm_size(MPI_COMM_WORLD, &size);    // total number of processes
    std::cout << "Hello world from process " << rank << " of " << size << std::endl;
    MPI_Finalize();
    return 0;
}
EOF
$ mpicxx hello.cpp -o hello
$ srun -p test --mpi=pmi2 -n 4 ./hello
Hello world from process 1 of 4
Hello world from process 2 of 4
Hello world from process 0 of 4
Hello world from process 3 of 4
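The same run also works from a batch script; a sketch reusing the partition and task count from the srun line above:
$ cat > hello.sbatch <<'EOF'
#!/bin/bash
#SBATCH -p test
#SBATCH -n 4
srun --mpi=pmi2 ./hello
EOF
$ sbatch hello.sbatch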
Python MPI (mpi4py)
https://blog.csdn.net/weixin_39594457/article/details/110780781
Set up the environment (note the quoted EOF, so $UCX and friends expand when env.sh is sourced rather than when the file is written):
cat > env.sh <<'EOF'
#!/bin/bash
UCX=/usr/local/ucx
OPENMPI=/usr/local/openmpi
PMIX3=/usr/local/pmix3
LIBEVENT=/usr/local/libevent
HWLOC=/usr/local/hwloc
export PATH=$UCX/bin:$OPENMPI/bin:$PMIX3/bin:$LIBEVENT/bin:$HWLOC/bin:$PATH
export LD_LIBRARY_PATH=$UCX/lib:$OPENMPI/lib:$PMIX3/lib:$LIBEVENT/lib:$HWLOC/lib:$LD_LIBRARY_PATH
export OMPI_ALLOW_RUN_AS_ROOT=1
export OMPI_ALLOW_RUN_AS_ROOT_CONFIRM=1
EOF
source env.sh
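The notes skip the Open MPI build itself; a minimal configure sketch assuming the prefixes from env.sh (run from inside the openmpi-4.0.5 source tree, before building the examples):
$ ./configure --prefix=/usr/local/openmpi \
    --with-ucx=/usr/local/ucx \
    --with-pmix=/usr/local/pmix3 \
    --with-libevent=/usr/local/libevent \
    --with-hwloc=/usr/local/hwloc
$ make -j install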
cd openmpi-4.0.5
cd examples
make hello_c
[root@mn0 examples]# mpirun -np 1 ./hello_c
Hello, world, I am 0 of 1, (Open MPI v4.0.5, package: Open MPI root@mn0 Distribution, ident: 4.0.5, repo rev: v4.0.5, Aug 26, 2020, 103)
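To confirm the build actually picked up external PMIx, ompi_info can be grepped (a quick check; exact output varies by version):
$ ompi_info | grep -i pmix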
Save the following as demo.py:
from mpi4py import MPI
import sys
import time

size = MPI.COMM_WORLD.Get_size()    # total number of ranks
rank = MPI.COMM_WORLD.Get_rank()    # rank of this process
name = MPI.Get_processor_name()     # host the process runs on
sys.stdout.write("Hello, World! I am process %d of %d on %s.\n" % (rank, size, name))
time.sleep(2)
mpirun python demo.py
srun -n 2 --mpi=pmix python demo.py
https://mpi4py.readthedocs.io/en/stable/index.html
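Beyond hello world, mpi4py also covers point-to-point messaging; a minimal send/recv sketch (the file name p2p.py is illustrative; run with at least two ranks, e.g. srun -n 2 --mpi=pmix python p2p.py):
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank == 0:
    # send any picklable Python object to rank 1
    comm.send({"msg": "hello from rank 0"}, dest=1, tag=11)
elif rank == 1:
    data = comm.recv(source=0, tag=11)
    print(data)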
If the Anaconda-provided mpicc fails with:
/data3/cluster/anaconda3/bin/mpicc: line 301: x86_64-conda_cos6-linux-gnu-cc: command not found
the conda compiler toolchain is missing; install it with:
conda install gxx_linux-64
https://github.com/RcppCore/Rcpp/issues/770
https://hcc.unl.edu/docs/submitting_jobs/