Reputation: 3781
Why does the following work:
mat = np.array(
[(0,0,0),
(0,0,0),
(0,0,0)],
dtype=[('MSFT','float'),('CSCO','float'),('GOOG','float') ]
)
while this doesn't:
mat = np.array(
[[0]*3]*3,
dtype=[('MSFT','float'),('CSCO','float'),('GOOG','float')]
)
TypeError: expected a readable buffer object
How can I easily create a matrix like
[[None]*M]*N
but with tuples in it, so that I can assign names to the columns?
Upvotes: 1
Views: 274
Reputation: 231425
When I make a zero array with your dtype:
In [548]: dt=np.dtype([('MSFT','float'),('CSCO','float'),('GOOG','float') ])
In [549]: A = np.zeros(3, dtype=dt)
In [550]: A
Out[550]:
array([(0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 0.0)],
dtype=[('MSFT', '<f8'), ('CSCO', '<f8'), ('GOOG', '<f8')])
notice that the display shows a list of tuples. That's intentional: it distinguishes the records of a structured dtype from the rows of an ordinary 2d array.
That also means that when creating the array or assigning values, you need to use a list of tuples as well.
For example let's make a list of lists:
In [554]: ll = np.arange(9).reshape(3,3).tolist()
In [555]: ll
Out[555]: [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
In [556]: A[:]=ll
...
TypeError: a bytes-like object is required, not 'list'
but if I turn it into a list of tuples:
In [557]: llt = [tuple(i) for i in ll]
In [558]: llt
Out[558]: [(0, 1, 2), (3, 4, 5), (6, 7, 8)]
In [559]: A[:]=llt
In [560]: A
Out[560]:
array([(0.0, 1.0, 2.0), (3.0, 4.0, 5.0), (6.0, 7.0, 8.0)],
dtype=[('MSFT', '<f8'), ('CSCO', '<f8'), ('GOOG', '<f8')])
assignment works fine. That list can also be passed directly to np.array:
In [561]: np.array(llt, dtype=dt)
Out[561]:
array([(0.0, 1.0, 2.0), (3.0, 4.0, 5.0), (6.0, 7.0, 8.0)],
dtype=[('MSFT', '<f8'), ('CSCO', '<f8'), ('GOOG', '<f8')])
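Applied to the failing call from the question, the same conversion might look like the sketch below. The helper name rows_to_structured is made up for illustration. The second variant uses numpy.lib.recfunctions.unstructured_to_structured, which I believe needs NumPy 1.16 or newer, so treat it as an optional alternative rather than the required way.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

dt = np.dtype([('MSFT', 'float'), ('CSCO', 'float'), ('GOOG', 'float')])

def rows_to_structured(rows, dtype):
    """Hypothetical helper: make each row a tuple so NumPy reads it as one record."""
    return np.array([tuple(row) for row in rows], dtype=dtype)

# The list-of-lists input that failed in the question, converted row by row.
mat = rows_to_structured([[0] * 3] * 3, dt)

# Alternative (assumes NumPy >= 1.16): convert an ordinary 2d array,
# whose last axis has one entry per field, into a structured array.
mat2 = rfn.unstructured_to_structured(np.zeros((3, 3)), dt)

print(mat['CSCO'])
print(mat2.dtype.names)
```

Either way the result is a 1d array of 3 records, not a 3x3 array.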
Similarly, assigning values to one record requires a tuple, not a list:
In [563]: A[0]=(10,12,14)
The other common way of setting values is on a field by field basis. That can be done with a list or array:
In [564]: A['MSFT']=[100,200,300]
In [565]: A
Out[565]:
array([(100.0, 12.0, 14.0), (200.0, 4.0, 5.0), (300.0, 7.0, 8.0)],
dtype=[('MSFT', '<f8'), ('CSCO', '<f8'), ('GOOG', '<f8')])
The np.rec.fromarrays function recommended in the other answer ends up using the copy-by-fields approach. Its code is, in essence:
arrayList = [sb.asarray(x) for x in arrayList]
# <determine shape>
# <determine dtype>
_array = recarray(shape, descr)
# populate the record array (makes a copy)
for i in range(len(arrayList)):
    _array[_names[i]] = arrayList[i]
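A runnable sketch of that copy-by-fields idea, written against the public API only. The function name from_columns is mine, not NumPy's, and it skips the shape and dtype inference that the real function does; it assumes all columns have the same length and that the dtype is given.

```python
import numpy as np

def from_columns(columns, dtype):
    """Illustrative only: fill a structured array one field at a time."""
    dtype = np.dtype(dtype)
    columns = [np.asarray(col) for col in columns]
    # Assumption: every column has the same shape as the first one.
    out = np.zeros(columns[0].shape, dtype=dtype)
    for name, col in zip(dtype.names, columns):
        out[name] = col  # copies the column's values into the field
    return out

dt = [('MSFT', 'float'), ('CSCO', 'float'), ('GOOG', 'float')]
A = from_columns([[100, 200, 300], [12, 4, 7], [14, 5, 8]], dt)
print(A)
```

Each inner list here is a column (one field), not a row, which is the opposite of the list-of-tuples layout.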
Upvotes: 4
Reputation: 404
If you have a number of 1D arrays (columns) you would like to merge while keeping column names, you can use np.rec.fromarrays:
>>> dt = np.dtype([('a', float),('b', float),('c', float),])
>>> np.rec.fromarrays([[0] * 3 ] * 3, dtype=dt)
rec.array([(0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 0.0)], dtype=[('a', '<f8'), ('b', '<f8'), ('c', '<f8')])
This gives you a record/structured array in which columns can have names and different datatypes.
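To illustrate the "different datatypes" part, here is a small sketch with one string, one float and one integer column. The field names and sample values are invented for the example; the names argument takes a comma-separated string and the field types are inferred from each column.

```python
import numpy as np

# Made-up sample columns, one list per field.
tickers = ['MSFT', 'CSCO', 'GOOG']
prices = [100.0, 12.0, 14.0]
volumes = [10, 20, 30]

rec = np.rec.fromarrays([tickers, prices, volumes],
                        names='ticker,price,volume')

print(rec.dtype.names)   # field names
print(rec.price)         # a recarray also allows attribute access to fields
print(rec['volume'])     # ordinary field indexing still works
```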
Upvotes: 3