Reputation: 159
I'm currently embedding Python in my C++ program using boost/python in order to use matplotlib. Now I'm stuck at a point where I have to construct a large data structure, let's say a dense 10000x10000 matrix of doubles. I want to plot columns of that matrix and I figured that i have multiple options to do so:
Which of these is the best option concerning performance and of course memory consumption or is there an even better way of doing that kind of task.
Upvotes: 3
Views: 1285
Reputation: 159
Altough Tanner's answer brought me a big step forward, I ended up using Boost.NumPy, an inofficial extension to Boost.Python that can easily be added. It wraps around the NumPy C API and makes it more save and easier to use.
Upvotes: 1
Reputation: 51891
To prevent copying in Boost.Python, one can either:
If the matrix has a C-style contiguous memory layout, then consider using the Numpy C-API. The PyArray_SimpleNewFromData()
function can be used to create an ndarray object thats wraps memory that has been allocated elsewhere. This would allow one to expose the data to Python without requiring copying or transferring each element between the languages. The how to extend documentation is a great resource for dealing with the Numpy C-API:
Sometimes, you want to wrap memory allocated elsewhere into an ndarray object for downstream use. This routine makes it straightforward to do that. [...] A new reference to an ndarray is returned, but the ndarray will not own its data. When this ndarray is deallocated, the pointer will not be freed.
[...] If you want the memory to be freed as soon as the ndarray is deallocated then simply set the
OWNDATA
flag on the returned ndarray.
Also, while the plotting function may create copies of the array, it can do so within the C-API, allowing it to take advantage of the memory layout.
If performance is a concern, it may be worth considering the plotting itself:
Upvotes: 3