Reputation: 2336
I want to eliminate a memory copy step in my data processing pipeline.
I want to do the following:
Generate some data with a custom C library.
Feed the generated data into an MXNet model running on a GPU.
For now, my pipeline does the following:
1. Create a C-contiguous numpy array via np.empty(...).
2. Get the pointer to the numpy array's buffer via np.ndarray.__array_interface__.
3. Call the C library from Python (via ctypes) to fill the numpy array.
4. Convert the numpy array into an MXNet NDArray; this copies the underlying memory buffer.
5. Pack the NDArrays into an mx.io.DataBatch instance, then feed it into the model.
Note that all arrays stay in CPU memory until they are fed into the model.
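For reference, the first three steps can be sketched roughly as below. This is only a minimal sketch: fill_buffer is a hypothetical pure-Python stand-in for the custom C library call (which is not shown here), and the shape and dtype are made up for illustration.

```python
import ctypes
import numpy as np

# Step 1: allocate an uninitialised C-contiguous buffer (shape is illustrative).
arr = np.empty((4, 4), dtype=np.float32)

# Step 2: raw address of the buffer; "data" is an (address, read_only) tuple.
addr, read_only = arr.__array_interface__["data"]


def fill_buffer(address, count):
    """Hypothetical stand-in for the C library: writes 0, 1, 2, ... as floats."""
    out = (ctypes.c_float * count).from_address(address)
    for i in range(count):
        out[i] = float(i)


# Step 3: native code writes straight into the numpy buffer, no copy involved.
fill_buffer(addr, arr.size)
```

With a real library, the address would instead be passed as an argument to the function loaded via ctypes. The copy only happens afterwards, when the array is converted to an NDArray.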
I noticed that mx.io.DataBatch only accepts lists of mx.ndarray.NDArray for its data and label parameters, not numpy arrays. Passing numpy arrays works until you feed the batch into a model. On the other hand, I have a C library that can write directly to a C-contiguous array.
I would like to avoid the memory copy in step 4 (the numpy-to-NDArray conversion). One possible way is to somehow get a raw pointer to the NDArray's buffer and bypass numpy entirely, but anything that works is fine.
Upvotes: 1
Views: 662
Reputation: 2336
I figured out a hacky way to achieve this. Here's a small example.
from ctypes import POINTER, c_float, c_uint64, cast
import numpy as np
import mxnet as mx
m = mx.ndarray.zeros((4, 4))  # float32 by default, hence c_float below
m.wait_to_read()  # make sure the data is allocated
c_uint64_p = POINTER(c_uint64)  # assumes 64-bit pointers
handle = cast(m.handle, c_uint64_p)  # NDArray*
ptr_ = cast(handle[0], c_uint64_p)  # first word of shared_ptr<Chunk>: Chunk*
dptr = cast(ptr_[0], POINTER(c_float))  # first word of Chunk: shandle.dptr
n = np.ctypeslib.as_array(dptr, shape=(4, 4))  # m and n share the same buffer
I derived the above code by reading the MXNet C++ source code. Some explanation:
First, note the NDArray.handle attribute. It is a c_void_p. Reading the Python source shows that it is an NDArrayHandle. In src/c_api/c_api_ndarray.cc, that handle is reinterpreted as an NDArray*.
In the source tree, go to include/mxnet/ndarray.h and find the NDArray class. The first field is:
/*! \brief internal data of NDArray */
std::shared_ptr<Chunk> ptr_{nullptr};
Looking at Chunk, which is a struct defined inside NDArray, we see:
/*! \brief the real data chunk that backs NDArray */
// shandle is used to store the actual values in the NDArray
// aux_handles store the aux data(such as indices) if it's needed by non-default storage.
struct Chunk {
/*! \brief storage handle from storage engine.
for non-default storage, shandle stores the data(value) array.
*/
Storage::Handle shandle;
Finally, the type of shandle, Storage::Handle, is defined in include/mxnet/storage.h:
struct Handle {
/*!
* \brief Pointer to the data.
*/
void* dptr{nullptr};
Writing a small program shows that sizeof(shared_ptr<some_type>) is 16. Based on this question, we can guess that shared_ptr is composed of two pointers (8 bytes each on a 64-bit build). It's not too hard to figure out that the first one is the pointer to the managed object, here the Chunk, whose first member is shandle.dptr. Putting everything together, all that is needed is two pointer dereferences.
On the downside, this method should not be used in production or in large projects. It relies on MXNet's internal memory layout, so it could break in a future release or introduce tough bugs and security holes.
Upvotes: 2