Dave

Reputation: 8099

Make a C++ class look like a numpy array using swig

What's a good way to expose a C++ class that provides an array-like interface for use with numpy (scipy)?

By array-like interface I mean something like:

//file:Arr.h
class Arr{
public:
    int n_rows;
    int n_cols;
    float* m_data;

    Arr(int r, int c, float v);
    virtual ~Arr();
    float get(int i, int j);
    void set(int i, int j, float v);

    long data_addr(){
        return (long)(m_data);
    }
};

Constraints:

My current approach is to put a pythoncode block in my SWIG .i file that looks something like:

%pythoncode %{
def arraylike_getitem(self, key):
   # Python passes a[i, j] as a single tuple argument
   row_key, col_key = key
   # the actual implementation to handle slices
   # is pretty complicated but involves:
   # 1. constructing an uninitialized numpy array for the return value,
   # 2. iterating over the indices indicated by the slices,
   # 3. calling self.get for each of the index pairs,
   # 4. returning the array

# add the function to the Arr class
Arr.__getitem__=arraylike_getitem
%}

where ArrayLike (Arr in the test case below) is the C++ class that holds the numerical data (as a flat array) and provides member functions to get/set individual values.

The main drawback is step 1 above: I have to make a copy of any slice that I take of my C++ array class. (The main advantage is that by returning a numpy array object, I know that I can use it in any numpy operations that I want.)
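The copying approach described above can be sketched in plain Python. This is only an illustration under stated assumptions: `FakeArr` is a hypothetical stand-in for the SWIG-wrapped `Arr` (same `n_rows`, `n_cols`, `get` and `set`), `_as_range` is a helper invented for the example, and the result is a nested list where the real code would fill a numpy array.

```python
class FakeArr:
    """Hypothetical pure-Python stand-in for the SWIG-wrapped Arr."""

    def __init__(self, n_rows, n_cols, value):
        self.n_rows = n_rows
        self.n_cols = n_cols
        self._data = [float(value)] * (n_rows * n_cols)

    def get(self, i, j):
        return self._data[i * self.n_cols + j]

    def set(self, i, j, v):
        self._data[i * self.n_cols + j] = float(v)


def _as_range(index, length):
    """Turn an int or a slice into the range of indices it selects."""
    if isinstance(index, slice):
        return range(*index.indices(length))
    if index < 0:
        index += length
    if not 0 <= index < length:
        raise IndexError("index out of range")
    return range(index, index + 1)


def arraylike_getitem(self, key):
    """Copying __getitem__: visits every selected (i, j) pair via get()."""
    row_key, col_key = key  # a[i, j] arrives as the tuple (i, j)
    rows = _as_range(row_key, self.n_rows)
    cols = _as_range(col_key, self.n_cols)
    # A nested list stands in for the uninitialized numpy array of step 1.
    return [[self.get(i, j) for j in cols] for i in rows]


FakeArr.__getitem__ = arraylike_getitem
```

Every selected element is copied one `get` call at a time, which is the cost the question is trying to avoid. In this sketch an integer index is treated as a one-element range, so `a[1, 2]` returns a 1x1 nested list rather than a scalar.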

I can imagine two approaches for improving this:

  1. adding (via SWIG %extend) additional functionality to the C++ class, and/or
  2. having the Python function return an array-slice proxy object.

My main hang-up is not knowing what interface an object needs to (efficiently) implement in order to quack like a numpy array.
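One candidate for that interface is numpy's `__array_interface__` protocol, whose optional `strides` entry can describe a slice of existing memory without copying it. The following is a minimal sketch of such a slice proxy, not a tested solution for the SWIG build: `ArrSliceProxy` and its constructor arguments are hypothetical names, and a ctypes buffer stands in for `m_data`.

```python
import ctypes

ITEMSIZE = ctypes.sizeof(ctypes.c_float)  # 4 bytes, matching typestr '<f4'


class ArrSliceProxy:
    """Hypothetical no-copy view onto a row-major float buffer."""

    def __init__(self, owner, addr, n_rows, n_cols, row_slice, col_slice):
        self.owner = owner  # reference kept so the buffer outlives the proxy
        r0, r1, rstep = row_slice.indices(n_rows)
        c0, c1, cstep = col_slice.indices(n_cols)
        self.shape = (len(range(r0, r1, rstep)), len(range(c0, c1, cstep)))
        row_stride = n_cols * ITEMSIZE
        # Byte steps between consecutive selected rows and columns.
        self.strides = (rstep * row_stride, cstep * ITEMSIZE)
        # Address of the first selected element.
        self.addr = addr + r0 * row_stride + c0 * ITEMSIZE

    @property
    def __array_interface__(self):
        return {
            'shape': self.shape,
            'typestr': '<f4',
            'data': (self.addr, False),  # (pointer, read-only flag)
            'strides': self.strides,
            'version': 3,
        }


# Stand-in for a 3x4 m_data holding 0.0 .. 11.0 in row-major order.
buf = (ctypes.c_float * 12)(*range(12))
proxy = ArrSliceProxy(buf, ctypes.addressof(buf), 3, 4,
                      slice(None, None, 2), slice(1, None, 2))
```

With numpy installed, `numpy.asarray(proxy)` is expected to produce a strided view over the same memory rather than a copy. In the SWIG setting, `owner` would be the `Arr` object and `addr` the value of `data_addr()`.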

Test Case

Here's my test setup:

//file:Arr.h
class Arr{
public:
    int n_rows;
    int n_cols;
    float* m_data;

    Arr(int r, int c, float v);
    virtual ~Arr();
    float get(int i, int j);
    void set(int i, int j, float v);

    long data_addr(){
        return (long)(m_data);
    }
};

//-----------------------------------------------------------

//file Arr.cpp
#include "Arr.h"

Arr::Arr(int r, int c, float v): n_rows(r), n_cols(c), m_data(0){
    m_data=new float[ r*c ];
    for( int i=0; i<r*c; ++i){
        m_data[i]=v;
    }
}  
Arr::~Arr(){
    delete[] m_data;
}

float Arr::get(int i, int j){
    return m_data[ i*n_cols+j];
}
void Arr::set(int i, int j, float v){
    m_data[i*n_cols+j]=v;
}

//--------------------------------------------------------------------
//file:arr.i
%module arr

%{
#include "Arr.h"
#include <python2.7/Python.h>
#include </usr/lib64/python2.7/site-packages/numpy/core/include/numpy/ndarrayobject.h>
%}

%include "Arr.h"


%pythoncode %{

# Partial solution (developed in constructing the question): allows operations between
# arr objects and numpy arrays (e.g. numpy_array+arr_object is OK)
# but does not allow slicing (e.g. numpy_array[::2,::2]+arr_object[::2,::2])
# TODO: figure out how to get slices without copying memory
def arr_interface_map(self):
    # 'data' is a (pointer, read-only flag) tuple
    res={ 'shape':(self.n_rows, self.n_cols), 'typestr':'<f4', 'data': (self.data_addr(),0), 'version':3 }
    return res

Arr.__array_interface__=property( arr_interface_map )


%}

//---------------------------------------------------------
#file: Makefile
INCLUDE_FLAGS = -I/usr/include/python2.7
# -fPIC is needed for objects that are linked into a shared library
CXXFLAGS += -fPIC ${INCLUDE_FLAGS}

arr_wrap.cpp: arr.i Arr.h
	swig -c++ -python -o $@ ${INCLUDE_FLAGS} arr.i

_arr.so: arr_wrap.o Arr.o
	g++ -shared -o _arr.so arr_wrap.o Arr.o

clean:
	rm -f *.o *_wrap.cpp *.so

all: _arr.so

If I can get this Arr class to work with numpy, then I've succeeded.

Edit: From this related question it looks like __array_interface__ will be part of the solution (TBD: how to use it?)

Upvotes: 4

Views: 1171

Answers (1)

user4815162342

Reputation: 155366

If n_cols and n_rows are (effectively) immutable, your best course of action is to simply create a real numpy array, giving it m_data as storage and (n_rows, n_cols) as shape. That way you will get all the numpy array facilities without any copying and without having to reimplement them in your own code (which would be a lot of quacking to imitate).

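#include <Python.h>
#include <numpy/arrayobject.h>

// Wraps obj's buffer without copying. The array does not own the memory,
// so obj must outlive it. import_array() must have been called beforehand.
PyObject* array_like_to_numpy(ArrayLike& obj)
{
    npy_intp dims[] = { obj.n_rows, obj.n_cols };
    return PyArray_SimpleNewFromData(2, dims, NPY_FLOAT, obj.m_data);
}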

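Note that this needs direct access to m_data. The Arr class shown in the question declares it public, so the function works once ArrayLike is replaced by Arr. If the member is protected in your real class, it would be a good idea to either make it public or provide an accessor to retrieve it (or inherit from ArrayLike and provide such functionality in your subclass).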
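The same no-copy idea can be tried from the Python side using the address that `data_addr()` already returns. This is a rough sketch rather than part of the answer's method: `view_from_address` is a hypothetical helper, and a ctypes buffer stands in for the memory owned by the C++ object.

```python
import ctypes


def view_from_address(addr, n_rows, n_cols):
    """Return a ctypes float array aliasing n_rows*n_cols floats at addr."""
    array_type = ctypes.c_float * (n_rows * n_cols)
    # No copy is made; the caller must keep the underlying memory alive.
    return array_type.from_address(addr)


# Stand-in for the C++ side: a 2x3 buffer filled with 1.5,
# like the storage that Arr(2, 3, 1.5) would allocate.
backing = (ctypes.c_float * 6)(*[1.5] * 6)
addr = ctypes.addressof(backing)  # plays the role of arr.data_addr()

view = view_from_address(addr, 2, 3)
view[1 * 3 + 2] = 9.0  # write element (1, 2) through the view
```

After the write, `backing` sees the new value because both names alias the same memory. As with `PyArray_SimpleNewFromData`, nothing here ties the view's lifetime to the owner, so the `Arr` object has to be kept alive for as long as the view is in use.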

Upvotes: 4
