Kh40tiK

Reputation: 2336

avoid copying during array conversion between numpy and mxnet

I want to remove a memory copy step from my data processing pipeline.

I want to do the following:

  1. Generate some data from a custom C library

  2. Feed the generated data into an MXNet model running on GPU.

For now, my pipeline does the following:

  1. Create a C-contiguous numpy array via np.empty(...).

  2. Get the pointer to the numpy array's buffer via np.ndarray.__array_interface__.

  3. Call the C library from Python (via ctypes) to fill the numpy array.

  4. Convert the numpy array into an mxnet NDArray; this copies the underlying memory buffer.

  5. Pack the NDArrays into a mx.io.DataBatch instance, then feed it into the model.

Please note that all arrays stay in CPU memory until they are fed into the model.
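For concreteness, steps 1 to 3 can be sketched roughly as below. This is only an illustration: fill_buffer is a hypothetical stand-in for the custom C library (written with ctypes so the snippet runs on its own), and the 4x4 float32 shape is an assumption.

```python
import ctypes
import numpy as np


def fill_buffer(address, count):
    """Hypothetical stand-in for the C library: writes 0, 1, 2, ... as floats."""
    out = (ctypes.c_float * count).from_address(address)
    for i in range(count):
        out[i] = float(i)


# Step 1: allocate an uninitialised, C-contiguous float32 array.
arr = np.empty((4, 4), dtype=np.float32)

# Step 2: read the address of the underlying buffer.
# __array_interface__["data"] is an (address, read_only) tuple.
address, read_only = arr.__array_interface__["data"]

# Step 3: let the "C library" write straight into the numpy buffer.
fill_buffer(address, arr.size)

# Step 4 (not run here) would be something like mx.nd.array(arr),
# which is where the extra copy of the buffer happens.
```

The write in step 3 lands directly in the memory owned by arr, so no copy is involved until the conversion to an NDArray.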

I noticed that mx.io.DataBatch only accepts a list of mx.ndarray.NDArray objects as its data and label parameters, not numpy arrays. Passing numpy arrays appears to work until the batch is fed into a model. On the other hand, I have a C library that can write directly to a C-contiguous array.

I would like to avoid the memory copy in step 4. One possible way is to somehow get a raw pointer to the NDArray's buffer and bypass numpy entirely, but anything that works is fine.

Upvotes: 1

Views: 662

Answers (1)

Kh40tiK

Reputation: 2336

I figured out a hacky way to achieve this. Here's a small example.

from ctypes import POINTER, c_float, c_uint64, cast
import numpy as np
import mxnet as mx

m = mx.ndarray.zeros((4,4))
m.wait_to_read() # make sure the data is allocated

c_uint64_p = POINTER(c_uint64)

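handle = cast(m.handle, c_uint64_p)     # NDArray*; its first field is ptr_ (shared_ptr<Chunk>)
ptr_ = cast(handle[0], c_uint64_p)      # Chunk*, read from the first word of the shared_ptr
dptr = cast(ptr_[0], POINTER(c_float))  # shandle.dptr, the first field of Chunk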

n = np.ctypeslib.as_array(dptr, shape=(4,4)) # m and n will share buffer
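To show what "share buffer" means for the last line, here is a small sketch that does not need MXNet. The ctypes array named backing is an assumption: it only stands in for the memory owned by the NDArray.

```python
import ctypes
import numpy as np

# Stand-in for the NDArray's buffer: 16 floats owned by ctypes.
backing = (ctypes.c_float * 16)()
dptr = ctypes.cast(backing, ctypes.POINTER(ctypes.c_float))

# Wrap the raw pointer in a numpy array without copying.
view = np.ctypeslib.as_array(dptr, shape=(4, 4))

view[1, 2] = 7.0    # write through the numpy view
backing[15] = 3.0   # write through the original buffer
```

Both writes are visible from either side. The numpy array does not own the memory, so in the MXNet case m presumably has to stay alive for as long as n is in use.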

I derived the above code by reading the MXNet C++ source code. Some explanation:

First, note the NDArray.handle attribute. It's a c_void_p. If you read the Python source code, you will see that it is an NDArrayHandle. Now look at src/c_api/c_api_ndarray.cc, where it is reinterpreted as an NDArray*.

In the source tree, go to include/mxnet/ndarray.h and find the NDArray class. The first field is:

/*! \brief internal data of NDArray */
std::shared_ptr<Chunk> ptr_{nullptr};

Checking Chunk, which is a struct defined inside NDArray, we see:

  /*! \brief the real data chunk that backs NDArray */
  // shandle is used to store the actual values in the NDArray
  // aux_handles store the aux data(such as indices) if it's needed by non-default storage.
  struct Chunk {
    /*! \brief storage handle from storage engine.
               for non-default storage, shandle stores the data(value) array.
     */
    Storage::Handle shandle;

Finally, the type of shandle, Storage::Handle, is defined in include/mxnet/storage.h:

  struct Handle {
    /*!
     * \brief Pointer to the data.
     */
    void* dptr{nullptr};

Writing a small program shows that sizeof(shared_ptr<some_type>) is 16. Based on this question, we can guess that shared_ptr is composed of two pointers, and it's not too hard to figure out that the first one points to the managed object (here, the Chunk). Since shandle is the first field of Chunk and dptr is the first field of Handle, the first word of the Chunk is dptr itself. Putting everything together, all that is needed is two pointer dereferences.


On the downside, this method should not be used in production environments or large projects. It could break in a future release, or introduce tough bugs and security holes.
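The pointer chase can be tried without MXNet by mocking the layout described above. Everything in this sketch is an assumption made for illustration: the FakeNDArray, SharedPtr, Chunk and Handle classes are simplified stand-ins that keep only the first fields, shared_ptr is assumed to store the object pointer first, and pointers are assumed to be 64-bit.

```python
import ctypes

assert ctypes.sizeof(ctypes.c_void_p) == 8  # assumption: 64-bit platform


class Handle(ctypes.Structure):       # mimics Storage::Handle (first field only)
    _fields_ = [("dptr", ctypes.c_void_p)]


class Chunk(ctypes.Structure):        # mimics NDArray::Chunk (first field only)
    _fields_ = [("shandle", Handle)]


class SharedPtr(ctypes.Structure):    # assumed layout: object pointer, then control block
    _fields_ = [("obj", ctypes.POINTER(Chunk)), ("ctrl", ctypes.c_void_p)]


class FakeNDArray(ctypes.Structure):  # mimics NDArray (first field only)
    _fields_ = [("ptr_", SharedPtr)]


# Build the mock object graph around a small float buffer.
data = (ctypes.c_float * 4)(1.0, 2.0, 3.0, 4.0)
chunk = Chunk(Handle(ctypes.addressof(data)))
nd = FakeNDArray(SharedPtr(ctypes.pointer(chunk), None))

# Same casts as in the answer, applied to the mock instead of m.handle.
c_uint64_p = ctypes.POINTER(ctypes.c_uint64)
handle = ctypes.cast(ctypes.c_void_p(ctypes.addressof(nd)), c_uint64_p)  # NDArray*
ptr_ = ctypes.cast(handle[0], c_uint64_p)                    # Chunk*
dptr = ctypes.cast(ptr_[0], ctypes.POINTER(ctypes.c_float))  # shandle.dptr

values = [dptr[i] for i in range(4)]
```

Here values ends up holding the four floats from data, reached through exactly two dereferences. If the real field order or the shared_ptr layout differs, the same casts would read garbage, which is why the approach is fragile.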

Upvotes: 2
