Tim P

Reputation: 425

How to convert a PyArrow table to an Arrow C++ table when interfacing between PyArrow in Python and Arrow in C++

I have a C++ library that is built against the Apache Arrow C++ libraries, with Python bindings written using pybind11. I'd like to write a C++ function that takes a table constructed with PyArrow, like:

void test(arrow::Table test);

Passing in a PyArrow table like:

tab = pa.Table.from_pandas(df)
mybinding.test(tab)

If I write a naive function like the one above, I get:

TypeError: arrow_test(): incompatible function arguments. The following argument types are supported:
    1. (arg0: arrow::Table) -> None

Invoked with: pyarrow.Table

I've also tried writing a function that takes a py::object and calls .cast<arrow::Table>() on it, but the cast fails:

RuntimeError: Unable to cast Python instance to C++ type (compile in debug mode for details)

Does anyone have any idea how to get this to work?

Upvotes: 2

Views: 1887

Answers (1)

Uwe L. Korn

Reputation: 8796

You have to use the functionality provided in the arrow/python/pyarrow.h header. This header is auto-generated to support unwrapping the Cython pyarrow.Table objects to C++ arrow::Table instances. Linking to libarrow.so alone is not enough for this: the unwrapping functions live in libarrow_python.so, so you need to link against that library as well. The pyarrow Python package must also be importable, but this is solely a runtime, not a compile-time, dependency.

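// Header that declares the pyarrow <-> C++ conversion functions
#include <arrow/python/pyarrow.h>

// Ensure that the pyarrow Python module is loaded (returns 0 on success)
arrow::py::import_pyarrow();

PyObject* pyarrow_table = …
// With pybind11 you can also use
// pybind11::object pyarrow_table = …
// and pass pyarrow_table.ptr() to the call below

// Convert PyObject* to native C++ object; the Result carries an error
// status if the object is not a pyarrow.Table
arrow::Result<std::shared_ptr<arrow::Table>> result = arrow::py::unwrap_table(pyarrow_table);
if (!result.ok()) { /* handle result.status() */ }
std::shared_ptr<arrow::Table> table = *result;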

Upvotes: 1
