Reputation: 17631
I'm unlucky enough to be converting some MATLAB code into Python via numpy arrays.
Is there any consensus on num2cell()?
Personally, I think this goes against Python/numpy syntax. The idea is this:
Using num2cell, you'll end up with an array that looks like this:
array([[0],[1],[2],[3],[4],[5],[6],[7],[8]])
See the MathWorks documentation.
You could do this in numpy with a list comprehension:
matlab_lunacy = np.array([[x] for x in range(0, 9)])
But why do MATLAB users use this data structure?
What's the numpy equivalent?
Upvotes: 3
Views: 1865
Reputation: 231385
In the good old days (around v. 3.0) MATLAB had only one data structure, a matrix. It could contain numbers or characters, and was always 2d.
Cells were added to contain more general objects, including matrices and strings. They were still 2d.
Python had lists, which are 1d, but can contain anything. numpy is built on Python, adding the multidimensional arrays. But lists are still available.
So potentially anything that converts an array to a list is an equivalent to num2cell: not exact, but with overlapping functionality.
In [246]: A=np.arange(24).reshape(2,3,4) # 3d array
Wrapping it in a list gives us a list of two 2d arrays:
In [247]: B=list(A)
In [248]: B
Out[248]:
[array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]]),
array([[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]])]
The tolist method performs a complete conversion to nested lists:
In [249]: C=A.tolist()
In [250]: C
Out[250]:
[[[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]],
[[12, 13, 14, 15], [16, 17, 18, 19], [20, 21, 22, 23]]]
list(A) is not common, and may be used in error when tolist is meant.
np.split(A,...) is similar to B, but the subarrays are still 3d.
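To make the difference concrete, here is a small sketch comparing the two. The split count of 2 is an assumption chosen for illustration (the arguments are elided above); with it, np.split cuts A along the first axis and each piece keeps a leading dimension of size 1.

```python
import numpy as np

A = np.arange(24).reshape(2, 3, 4)  # same 3d array as above

# Split into 2 pieces along the first axis (the count 2 is an assumption);
# each piece keeps a leading dimension of size 1, so it is still 3d.
pieces = np.split(A, 2)
shapes_split = [p.shape for p in pieces]

# list(A) iterates over the first axis and drops it instead.
shapes_list = [b.shape for b in list(A)]

print(shapes_split)
print(shapes_list)
```

If the 2d pieces are what you want, list(A) or plain iteration is the simpler route.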
Unpacking even works, basically because A is an iterable; [a for a in A] splits A on the 1st dimension.
In [257]: a,b=A
In [258]: a
Out[258]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
There is an object dtype, which lets you put objects, including other arrays, in an array. But as has been shown in many SO questions, these can be tricky to construct. np.array tries to construct the highest dimension array possible. You have to perform some tricks to get around that.
In [259]: Z=np.empty((2,),dtype=object)
In [260]: Z
Out[260]: array([None, None], dtype=object)
In [261]: Z[0]=A[0]
In [262]: Z[1]=A[1]
In [263]: Z
Out[263]:
array([ array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]]),
array([[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]])], dtype=object)
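The same trick can be wrapped in a small function. This is only a sketch: to_object_array is a hypothetical helper name, not a numpy function, and it assumes a 1d result is what you want.

```python
import numpy as np

def to_object_array(items):
    """Hypothetical helper: put each item into one slot of a 1d object array."""
    out = np.empty(len(items), dtype=object)
    for i, item in enumerate(items):
        out[i] = item  # integer indexing stores the object itself
    return out

A = np.arange(24).reshape(2, 3, 4)

# np.array on equal-shaped pieces just rebuilds a numeric 3d array.
rebuilt = np.array([A[0], A[1]])

# Filling a preallocated object array keeps each piece as a separate element.
Z = to_object_array([A[0], A[1]])

print(rebuilt.shape, rebuilt.dtype)
print(Z.shape, Z.dtype, Z[0].shape)
```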
================
In an Octave session:
>> anum = [1,2,3,4]
anum =
1 2 3 4
>> acell = num2cell(anum)
acell =
{
[1,1] = 1
[1,2] = 2
[1,3] = 3
[1,4] = 4
}
>> save -7 test.mat anum acell
The scipy.io.loadmat version:
In [1822]: data = io.loadmat('../test.mat')
In [1823]: data
Out[1823]:
{'__globals__': [],
'__header__': b'MATLAB 5.0 MAT-file, written by Octave 4.0.0,
2016-10-27 00:59:27 UTC',
'__version__': '1.0',
'acell': array([[array([[ 1.]]), array([[ 2.]]), array([[ 3.]]),
array([[ 4.]])]], dtype=object),
'anum': array([[ 1., 2., 3., 4.]])}
The matrix is rendered as a 2d array; the cell as an object dtype array (2d), containing, in this case, 2d arrays.
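If you only want the plain values back out of such a cell, one way is to flatten the object array and unwrap each 1x1 array. The sketch below builds the 'acell' structure by hand (an assumption made so the example does not need the .mat file) rather than loading it.

```python
import numpy as np

# Emulate the 'acell' value shown above: a (1, 4) object array whose
# elements are (1, 1) float arrays. Built by hand, not loaded from a file.
acell = np.empty((1, 4), dtype=object)
for j, value in enumerate([1.0, 2.0, 3.0, 4.0]):
    acell[0, j] = np.array([[value]])

# Flatten the cell grid, then pull the scalar out of each 1x1 array.
values = [cell.item() for cell in acell.ravel()]

print(values)
```

scipy.io.loadmat also accepts squeeze_me=True, which drops unit dimensions at load time and may make this unwrapping unnecessary.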
Upvotes: 4
Reputation: 3052
A completely different route is to use MATLAB's Python interface, which lets you start the MATLAB engine from Python and call MATLAB code through it. See https://se.mathworks.com/help/matlab/matlab-engine-for-python.html
Upvotes: -2
Reputation: 97571
.astype(np.object_) is most likely the thing you need. Consider this MATLAB code:
x = [1 2 3 4]
y = num2cell(x)
y{end} = 'hello'
In numpy, that translates to:
x = np.array([1, 2, 3, 4])
y = x.astype(np.object_)
y[-1] = 'hello'
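The same approach carries over to higher dimensions, since astype keeps the shape. A short sketch with a 2d array (the values are made up for illustration):

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])
y = x.astype(np.object_)   # same shape, but each slot can now hold any object
y[0, 1] = 'hello'
y[1, 0] = [5, 6, 7]        # even a list fits in a single slot

print(y.shape, y.dtype)
print(y)
```

Note that astype returns a copy, so x itself stays a numeric array.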
Upvotes: 1
Reputation: 104483
num2cell essentially takes each element in an array and presents it as an individual cell in a cell matrix. The Python equivalent to a cell array is a list. Therefore, if you really wanted to create a num2cell equivalent in Python, you would create a list that has the same dimensions as your NumPy array and ensure that each element in this list goes in the right location. Something like this would work:
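import numpy as np

def num2cell(a):
    # Recurse while we still have an array; each level becomes a list.
    if isinstance(a, np.ndarray):
        return [num2cell(x) for x in a]
    # Base case: a single number, returned unchanged.
    return a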
This recursively goes through each dimension of your NumPy array. Iterating over an array yields its sub-arrays along the first dimension, and each sub-array is converted to a list in the same way. The base case is reached when the element is a single number rather than an array, in which case it is returned unchanged.
Here's a working example after defining num2cell in my Python workspace:
In [26]: import numpy as np
In [27]: A = np.random.rand(4,3,3)
In [28]: B = num2cell(A)
In [29]: A[0]
Out[29]:
array([[ 0.52971132, 0.91969837, 0.77649566],
[ 0.51080951, 0.8086879 , 0.61840573],
[ 0.7291165 , 0.0492292 , 0.53997368]])
In [30]: B[0]
Out[30]:
[[0.52971132352406691, 0.91969837282865874, 0.77649565991300817],
[0.51080951338602765, 0.80868789862631529, 0.61840573261134801],
[0.72911649507775378, 0.049229201932639577, 0.53997367763478676]]
In [31]: A[1][1]
Out[31]: array([ 0.41724412, 0.94679946, 0.79899245])
In [32]: B[1][1]
Out[32]: [0.41724411973558406, 0.9467994633124529, 0.7989924496851234]
We can see here that B is a list representation of the NumPy array A.
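For comparison, the built-in ndarray.tolist method produces the same nested structure. The sketch below redefines the function so it runs on its own and uses a small integer array (an assumption, chosen so the comparison is exact).

```python
import numpy as np

def num2cell(a):
    # Same recursive idea as above, redefined so this snippet is self-contained.
    if isinstance(a, np.ndarray):
        return [num2cell(x) for x in a]
    return a

A = np.arange(12).reshape(2, 3, 2)
B = num2cell(A)

# The nesting and the values match what tolist() returns.
same_as_tolist = (B == A.tolist())
print(same_as_tolist)

# One difference: the leaves of B are numpy scalars, while tolist()
# converts them to plain Python numbers.
print(type(B[0][0][0]), type(A.tolist()[0][0][0]))
```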
Upvotes: 2
Reputation: 8917
Depends on what the rest of the code is doing. Generally speaking, MATLAB uses cells to represent an array of arrays, where the inner arrays can have different sizes and shapes.
To answer your question though, I think that what you've done is basically what you want to do, i.e. create an array of arrays.
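When the inner arrays really do differ in shape, a plain Python list of arrays is probably the least surprising stand-in for a cell. A minimal sketch, with made-up contents:

```python
import numpy as np

# Three inner arrays with different shapes, as a MATLAB cell might hold.
cell_like = [np.arange(3), np.zeros((2, 2)), np.array([7.5])]

# Each element is still a full numpy array and can be used independently.
shapes = [part.shape for part in cell_like]
totals = [float(part.sum()) for part in cell_like]

print(shapes)
print(totals)
```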
Upvotes: 1