How can I concatenate arrays of tuples created with dtype?

Question

I am trying to concatenate two arrays created in this form:

data1 = numpy.array([(1,2,3),(4,5,6)], dtype={'names':['a', 'b', 'c'], 'formats':[int, float, float]})
data2 = numpy.array([(11,22),(44,55)], dtype={'names':['d', 'e'], 'formats':[int, float]})

I want to end up with something in this form:

array([(1, 2., 3., 11, 22.), (4, 5., 6., 44, 55.)],
      dtype=[('a', '



How can I do this?

This code almost gets me there, but I cannot figure out how to concatenate the dtypes:

  m = []
  for i,j in zip(data1, data2):
    print(i,j)
    m.append( (*i,*j) )


Extra question:

Are these kinds of manipulations easier with pandas?

I basically want arrays with named, typed fields that I can easily plot, write to CSV files (with a header), and extend with extra columns and rows if needed (e.g., compute a new column from other columns, or add rows from another dataset).

I am willing to change my code to build the arrays differently if needed, but I would still like to know how to deal with these dtype arrays.

Ehsan · Accepted Answer

You can use numpy.lib.recfunctions, as suggested in the comments:

import numpy.lib.recfunctions as rfn

arrays = [data1, data2]

m = rfn.merge_arrays(arrays, flatten=True)

Output:

m:

[(1, 2., 3., 11, 22.) (4, 5., 6., 44, 55.)]

m.dtype:

[('a', '
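
If you would rather finish the loop from the question by hand, one possible sketch is to join the field descriptions of both dtypes (dtype.descr) and pass the result to numpy.array. This assumes the two arrays have the same length, share no field names, and use simple packed dtypes like the ones in the question. The names combined_dtype, rows and manual are only illustrative.

```python
import numpy as np

data1 = np.array([(1, 2, 3), (4, 5, 6)],
                 dtype={'names': ['a', 'b', 'c'], 'formats': [int, float, float]})
data2 = np.array([(11, 22), (44, 55)],
                 dtype={'names': ['d', 'e'], 'formats': [int, float]})

# Concatenate the (name, format) pairs of both arrays into one dtype.
# Assumption: no field name appears in both arrays.
combined_dtype = np.dtype(data1.dtype.descr + data2.dtype.descr)

# Join the records pairwise, as in the loop from the question.
rows = [tuple(i) + tuple(j) for i, j in zip(data1, data2)]

# Build the combined structured array from the joined tuples.
manual = np.array(rows, dtype=combined_dtype)

print(manual)
print(manual.dtype)
```

Printing manual.dtype shows the full list of field names and formats for your platform. For anything beyond a quick one-off, merge_arrays is probably the less error-prone route.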



UPDATE: If data1 and data2 have fields in common, drop the shared fields from one of them before merging:

common_fields = set(data1.dtype.names).intersection(data2.dtype.names)

# Remove the common fields from data1 (use data2 here instead to drop them from data2)
data1 = data1[[name for name in data1.dtype.names if name not in common_fields]]

arrays = [data1, data2]
m = rfn.merge_arrays(arrays, flatten=True)
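
On the extra question: for named, typed columns that you want to extend and write to CSV, pandas is likely the more convenient tool. The following is a rough sketch rather than a recommendation for your exact data. It assumes pandas is installed; the column name f, the extra rows, and the in-memory buffer are made up for illustration.

```python
import io

import numpy as np
import pandas as pd

data1 = np.array([(1, 2, 3), (4, 5, 6)],
                 dtype={'names': ['a', 'b', 'c'], 'formats': [int, float, float]})
data2 = np.array([(11, 22), (44, 55)],
                 dtype={'names': ['d', 'e'], 'formats': [int, float]})

# A structured array converts directly; field names become column names.
# axis=1 places the two frames side by side (column-wise).
df = pd.concat([pd.DataFrame(data1), pd.DataFrame(data2)], axis=1)

# Compute a new column from existing ones (hypothetical column 'f').
df['f'] = df['b'] + df['e']

# Append rows from another dataset that has the same columns (made-up values).
extra = pd.DataFrame({'a': [7], 'b': [8.0], 'c': [9.0],
                      'd': [77], 'e': [88.0], 'f': [96.0]})
df = pd.concat([df, extra], ignore_index=True)

# Write CSV with a header row. A buffer is used here; pass a file path instead
# to write to disk.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
print(buffer.getvalue())
```

If you need a structured array again afterwards, df.to_records(index=False) converts the frame back to a NumPy record array. Plotting with df.plot() additionally requires matplotlib.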
