Reputation: 977
I have a Python application that needs to load the same large array (~4 GB) and do a perfectly parallel function on chunks of this array. The array starts off saved to disk.
I typically run this application on a cluster computer with something like, say, 10 nodes, each node of which has 8 compute cores and a total RAM of around 32GB.
The easiest approach (which doesn't work) is to launch 80 MPI processes (n=80) with mpi4py. The reason it doesn't work is that each MPI process will load the 4GB array, so the 8 processes on a node need 32GB for the data alone, exhausting the node's RAM and resulting in a MemoryError.
An alternative is that rank=0 is the only process that loads the 4GB array, and it farms out chunks of the array to the rest of the MPI ranks -- but this approach is slow because of network bandwidth issues.
The best approach would be if only one core on each node loads the 4GB array, and this array is made available as shared memory (through multiprocessing?) to the remaining 7 cores on that node.
How can I achieve this? How can I make MPI aware of nodes and have it coordinate with multiprocessing?
Upvotes: 2
Views: 1431
Reputation: 4467
Edit 16/06/2020: starting with Python 3.8, there is now shared memory support in multiprocessing (https://docs.python.org/3/library/multiprocessing.shared_memory.html) that can be used to keep one per-node copy of the data.
Before 3.8, the multiprocessing module did not offer this kind of shared memory.
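Under Python 3.8+, the per-node setup mentioned in the edit above can be sketched as follows. One core per node copies the array into a SharedMemory block; the other cores attach to it by name with no extra copy. The helper names and the tiny array are illustrative stand-ins:

```python
# Sketch: one loader core per node puts the array into shared memory;
# the other cores on that node attach by name. Helper names are illustrative.
import numpy as np
from multiprocessing import shared_memory

def load_into_shared(arr):
    """Run on one core per node: copy arr into a shared-memory block."""
    shm = shared_memory.SharedMemory(create=True, size=arr.nbytes)
    view = np.ndarray(arr.shape, dtype=arr.dtype, buffer=shm.buf)
    view[:] = arr            # the only full copy made on this node
    return shm               # shm.name is what the other cores attach by

def attach_to_shared(name, shape, dtype):
    """Run on the remaining cores: attach to the existing block, no copy."""
    shm = shared_memory.SharedMemory(name=name)
    return shm, np.ndarray(shape, dtype=dtype, buffer=shm.buf)

arr = np.arange(8, dtype=np.float64)       # stand-in for the 4GB array
owner = load_into_shared(arr)
worker_shm, view = attach_to_shared(owner.name, arr.shape, arr.dtype)
total = float(view.sum())                  # workers read the shared copy
worker_shm.close()
owner.close()
owner.unlink()
```

The owner must keep its SharedMemory object alive for as long as workers use the block, and call unlink() exactly once when everyone is done.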
You could look at the way joblib shares large numpy arrays between workers, using memory maps, or you could use manual memory mapping to avoid duplicating the data.
To pass the data only once per node, I would launch one MPI process per node and then use joblib for the remaining computation, as it automatically uses memmapping for large numpy array inputs.
Upvotes: 1
Reputation: 5794
MPI-3 has a shared memory facility for precisely your sort of scenario, and you can use MPI through mpi4py....
Use MPI_Comm_split_type to split your communicator into groups of ranks that live on the same node. Use MPI_Win_allocate_shared to allocate a window on each node, specifying a nonzero size on only one rank. Use MPI_Win_shared_query to get pointers to that window.
Upvotes: 1