Anthony Bak

Reputation: 1143

Shared Non-Contiguous-Access Numpy Array

I have a numpy array that I would like to share between a bunch of python processes in a way that doesn't involve copies. I create a shared numpy array from an existing numpy array using the sharedmem package.

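import sharedmem as shm

def convert_to_shared_array(A):
    # Allocate a shared-memory array with the same shape and dtype as A
    shared_array = shm.shared_empty(A.shape, A.dtype, order="C")
    # Copy A's contents into the shared buffer (a one-time copy)
    shared_array[...] = A
    return shared_array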

My problem is that each subprocess needs to access rows that are randomly distributed in the array. I pass the shared array to each subprocess, along with a list, idx, of the rows that it needs to access. The problem arises in the subprocess when I do:

#idx = list of randomly distributed integers

local_array = shared_array[idx,:]

# Do stuff with local array

This creates a copy of the selected rows instead of just another view. The array is quite large, and rearranging it before sharing it so that each process accesses a contiguous range of rows, like

local_array = shared_array[start:stop,:]

takes too long.

Question: What are good solutions for sharing random access to a numpy array between python processes that don't involve copying the array?

The subprocesses need readonly access (so no need for locking on access).

Upvotes: 5

Views: 620

Answers (1)

David Cournapeau

Reputation: 80770

Fancy indexing always produces a copy, so if you want to avoid copies you need to avoid fancy indexing. There is no way around it.
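To make the distinction concrete, here is a minimal sketch that checks which indexing forms share memory with the original array. It uses a plain ndarray as a stand-in for the shared array (an assumption; indexing semantics are the same), and the idx values and the row_sums helper are hypothetical. The helper shows one possible way to touch scattered rows without building a copy of all of them at once: indexing with a single integer is basic indexing, so each row is a view.

```python
import numpy as np

# Stand-in for the shared array (assumption: a plain ndarray indexes the same way).
shared_array = np.arange(20, dtype=np.float64).reshape(5, 4)
idx = [3, 0, 4]  # hypothetical, non-contiguous row indices

fancy = shared_array[idx, :]   # fancy indexing: data is copied into a new buffer
sliced = shared_array[1:4, :]  # basic slicing: a view onto the same buffer

fancy_shares = np.shares_memory(fancy, shared_array)
slice_shares = np.shares_memory(sliced, shared_array)


def row_sums(array, rows):
    """Visit the requested rows one at a time; array[i] is a view of one row."""
    return [float(array[i].sum()) for i in rows]


sums = row_sums(shared_array, idx)
```

Whether per-row access is fast enough depends on the work done per row; it trades one large copy for a Python-level loop over views.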

Upvotes: 1
