Reputation: 522
I am trying to concatenate two arrays created in this form:
data1 = numpy.array([(1,2,3),(4,5,6)], dtype={'names':['a', 'b', 'c'], 'formats':[int, float, float]})
data2 = numpy.array([(11,22),(44,55)], dtype={'names':['d', 'e'], 'formats':[int, float]})
I want to end up with something in this form:
array([(1, 2., 3., 11, 22.), (4, 5., 6., 44, 55.)],
dtype=[('a', '<i8'), ('b', '<f8'), ('c', '<f8'), ('d', '<i8'), ('e', '<f8')])
How can I do this?
This code almost gets me there, but I cannot figure out how to concatenate the dtypes:
m = []
for i, j in zip(data1, data2):
    print(i, j)
    m.append((*i, *j))
Extra question:
Are these kinds of manipulations easier with pandas?
I basically want arrays with named, typed fields that I can easily plot, write to CSV files (with a header), and extend with extra columns and rows when needed (e.g., compute a new column from other columns, or add rows from another dataset).
I am willing to change my code to make the arrays in a different way if needed, but would still like to know how to deal with these dtype arrays...
Reputation: 12407
You can use numpy.lib.recfunctions, as suggested in the comments:
import numpy.lib.recfunctions as rfn
arrays = [data1, data2]
# flatten=True collapses the per-array fields into a single flat dtype
m = rfn.merge_arrays(arrays, flatten=True)
Output:
m:
[(1, 2., 3., 11, 22.) (4, 5., 6., 44, 55.)]
m.dtype:
[('a', '<i8'), ('b', '<f8'), ('c', '<f8'), ('d', '<i8'), ('e', '<f8')]
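If you would rather stay close to the loop in the question, the dtypes can also be concatenated by hand. The following is only a minimal sketch, assuming both arrays have the same length and share no field names; the names combined_dtype and merged are mine, not part of NumPy:

```python
import numpy as np

data1 = np.array([(1, 2, 3), (4, 5, 6)],
                 dtype={'names': ['a', 'b', 'c'], 'formats': [int, float, float]})
data2 = np.array([(11, 22), (44, 55)],
                 dtype={'names': ['d', 'e'], 'formats': [int, float]})

# Build the combined dtype from the (name, dtype) pairs of both arrays.
# Assumption: the two arrays have no field names in common.
combined_dtype = np.dtype([(name, arr.dtype[name])
                           for arr in (data1, data2)
                           for name in arr.dtype.names])

# Allocate the result and copy the data over one field at a time.
# Assumption: data1 and data2 have the same shape.
merged = np.empty(data1.shape, dtype=combined_dtype)
for arr in (data1, data2):
    for name in arr.dtype.names:
        merged[name] = arr[name]

print(merged)
print(merged.dtype)
```

Copying field by field avoids building an intermediate Python list of tuples, which may matter for large arrays.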
UPDATE: If data1 and data2 have fields in common, drop the shared fields from one of them before merging:
common_fields = set(data1.dtype.names).intersection(set(data2.dtype.names))
# Remove the common fields from data1 (use data2 here instead to drop them from data2)
data1 = data1[[name for name in data1.dtype.names if name not in common_fields]]
arrays = [data1, data2]
m = rfn.merge_arrays(arrays, flatten=True)
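Regarding the extra question: the operations you list (adding columns, writing a CSV with a header, converting back to a structured array) are probably more convenient in pandas. Here is a rough sketch, assuming pandas is installed; the column name f and the variables df, csv_text and back are just illustrative:

```python
import numpy as np
import pandas as pd

data1 = np.array([(1, 2, 3), (4, 5, 6)],
                 dtype={'names': ['a', 'b', 'c'], 'formats': [int, float, float]})
data2 = np.array([(11, 22), (44, 55)],
                 dtype={'names': ['d', 'e'], 'formats': [int, float]})

# A structured array converts directly; field names become column names.
# axis=1 places the two frames side by side.
df = pd.concat([pd.DataFrame(data1), pd.DataFrame(data2)], axis=1)

# Compute a new column from existing ones ('f' is a made-up name).
df['f'] = df['b'] + df['c']

# CSV text with a header row; pass a file path to write to disk instead.
csv_text = df.to_csv(index=False)

# Convert back to a NumPy record array with named, typed fields.
back = df.to_records(index=False)

print(df)
print(back.dtype)
```

Rows from another dataset with the same columns can be appended with pd.concat([df, other_df], ignore_index=True), and df.plot() covers basic plotting when matplotlib is available.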