Chun-Ye Lu

Reputation: 385

Using ctypes in Python to pass a 2-D array to a C function

I want to use C to speed up some computations. For example, I have a C function that adds two matrices:

// mat_add.c
#include <stdlib.h>

void matAdd(int ROW, int COL, int x[][COL], int y[][COL], int z[][COL]){
    int i, j;
    for (i = 0; i < ROW; i++){
        for (j = 0; j < COL; j++){
            z[i][j] = x[i][j] + y[j][j];
        }
    }
}

Then I compiled it into a shared library (.so file):
gcc -shared -fPIC mat_add.c -o mat_add.so

And in Python:

# mat_add_test.py
import ctypes
import numpy as np

def cfunc(x, y):
    nrow, ncol = x.shape
    
    objdll = ctypes.CDLL('./mat_add.so')
    
    func = objdll.matAdd
    func.argtypes = [
        ctypes.c_int,
        ctypes.c_int,
        np.ctypeslib.ndpointer(dtype=np.int, ndim=2, shape=(nrow, ncol)),
        np.ctypeslib.ndpointer(dtype=np.int, ndim=2, shape=(nrow, ncol)),
        np.ctypeslib.ndpointer(dtype=np.int, ndim=2, shape=(nrow, ncol))
    ]
    func.restype = None
    
    z = np.empty_like(x)
    func(nrow, ncol, x, y, z)
    return z


if __name__ == '__main__':
    x = np.array([[1, 2], [3, 4]], dtype=np.int)
    y = np.array([[2, 2], [5, 6]], dtype=np.int)
    z = cfunc(x, y)
    print(z)
    print('end')

When I run this Python file, I get:

$ python mat_add_test.py 
[[                  3                   4]
 [8386863780988286322 7813586346238636153]]
end

The first row of the returned matrix is correct, but the second row is wrong. I guess the values in z are not being updated properly, but I have no idea where the problem is.
Can anyone help? Thanks very much!

Upvotes: 1

Views: 339

Answers (1)

alani

Reputation: 13049

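The handling of the 2-D array in the question is correct. Apart from a typo in how the C code indexes the y array (y[j][j] should be y[i][j]), the only problem is that np.int is 64 bits wide here (it is an alias for Python's int, which NumPy stores as a 64-bit integer on most 64-bit Linux and macOS systems), so it does not correspond to a C int, which is typically 32 bits.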

To ensure that the types match, an explicit length can be specified in both languages.

In Python: use np.int32 or np.int64 explicitly (instead of np.int).

In C: #include <stdint.h> and then use int32_t or int64_t correspondingly (possibly via a typedef), instead of int.

Then the problem goes away.
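To check the widths involved on a given machine, something like the following sketch may help. It uses only the standard ctypes module, and the variable names are illustrative rather than taken from the question.

```python
import ctypes

# Byte width of a plain C int on this platform (commonly 4, but not guaranteed).
c_int_size = ctypes.sizeof(ctypes.c_int)

# Fixed-width types are the same size everywhere, which is why they are safer
# to pair with np.int32 / np.int64 on the NumPy side.
int32_size = ctypes.sizeof(ctypes.c_int32)
int64_size = ctypes.sizeof(ctypes.c_int64)

print("C int:   ", c_int_size, "bytes")
print("int32_t: ", int32_size, "bytes")
print("int64_t: ", int64_size, "bytes")

# If the array elements are 8 bytes wide but C int is narrower, the C loop
# walks over the buffer with the wrong stride.
mismatch = c_int_size != int64_size
```

If the first line prints 4 while the NumPy array holds 8-byte elements, that is the mismatch described above.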

ROW and COL are passed by value, so ctypes converts them to C int on the way in and the exact width matters less (provided, of course, that the values do not overflow).
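The overflow caveat can be seen with a small sketch. ctypes integer types do no overflow checking when constructed, so an out-of-range value is silently wrapped. The fixed-width c_int32 is used here only so that the result does not depend on the platform's int size.

```python
import ctypes

# A value that fits in 32 bits is converted unchanged.
small = ctypes.c_int32(2).value

# A value that does not fit is silently wrapped, not rejected.
wrapped = ctypes.c_int32(2**31).value       # one past the largest int32
wrapped_again = ctypes.c_int32(2**32 + 5).value

print(small, wrapped, wrapped_again)
```

For matrix dimensions this is rarely a concern in practice, since they are far below the 32-bit limit.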

What is happening here

In reality a 2d array is still just a 1d sequence of values in memory; the 2 dimensions are just a convenient way to index it.

So in numpy the arrays just before calling C are (in hex):

0000000000000001 0000000000000002 0000000000000003 0000000000000004  <== x
0000000000000002 0000000000000002 0000000000000005 0000000000000006  <== y
UUUUUUUUUUUUUUUU UUUUUUUUUUUUUUUU UUUUUUUUUUUUUUUU UUUUUUUUUUUUUUUU  <== z 

where U means undefined / uninitialised data

but in the C code (assuming little endian), treating the arrays as 32-bit, it sees:

inputs
00000001 00000000 00000002 00000000 00000003 00000000 00000004 00000000  <== x
00000002 00000000 00000002 00000000 00000005 00000000 00000006 00000000  <== y
UUUUUUUU UUUUUUUU UUUUUUUU UUUUUUUU UUUUUUUU UUUUUUUU UUUUUUUU UUUUUUUU  <== z at start

Then the C code loops over the first 4 elements of each, performing additions, so this produces:

00000003 00000000 00000004 00000000 UUUUUUUU UUUUUUUU UUUUUUUU UUUUUUUU  <== z at end

and back in numpy using a 64-bit int type, now we see:

0000000000000003 0000000000000004 UUUUUUUUUUUUUUUU UUUUUUUUUUUUUUUU  <== output z

Interpreted as a 2-d array, this is array([[3, 4], [whatever, whatever]])
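The walk-through above can be reproduced without compiling anything. The sketch below is an illustration only: it uses the standard struct module to lay out the values as little-endian 64-bit integers, then reads and writes them as 32-bit integers the way the C loop would. The 0x55 filler byte is a made-up stand-in for uninitialised memory, and the helper name as_int64_bytes is hypothetical.

```python
import struct

def as_int64_bytes(values):
    """Pack values as little-endian 64-bit ints, like the NumPy buffers above."""
    return bytearray(struct.pack("<%dq" % len(values), *values))

x = as_int64_bytes([1, 2, 3, 4])
y = as_int64_bytes([2, 2, 5, 6])
# Stand-in for uninitialised memory: every byte set to 0x55.
z = bytearray(b"\x55" * 32)

# What the C loop effectively does: ROW * COL = 4 elements, 4 bytes each.
# (This uses the corrected y[i][j] indexing.)
for k in range(4):
    xv, = struct.unpack_from("<i", x, 4 * k)
    yv, = struct.unpack_from("<i", y, 4 * k)
    struct.pack_into("<i", z, 4 * k, xv + yv)

# Back on the Python side, the same 32 bytes are read as four 64-bit ints.
result = struct.unpack("<4q", bytes(z))
print(result)
```

The first two values come out as 3 and 4, while the last two are whatever the filler bytes happen to decode to, which mirrors the garbage in the second row of the question's output.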

Upvotes: 2
