CRP

Reputation: 121

In Python, how do I join two arrays by key column?

Suppose I have two arrays (after import numpy as np),

a=np.array([['a',1],['b',2]],dtype=object)

and

b=np.array([['b',3],['c',4]],dtype=object)

How do I get:

c=np.array([['a',1,None],['b',2,3],['c',None,4]],dtype=object)

Basically, an outer join using the first column as the key.

Thanks

Upvotes: 5

Views: 5534

Answers (2)

CRP

Reputation: 121

The best solution I found is to use pandas, which handles joins very well; pandas objects also convert to and from numpy arrays easily.
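For illustration, here is a minimal sketch of how that could look with the arrays from the question. The column names key, a_val and b_val are made up for this example, and the step that turns missing values back into None is just one way to match the output asked for.

```python
import numpy as np
import pandas as pd

a = np.array([['a', 1], ['b', 2]], dtype=object)
b = np.array([['b', 3], ['c', 4]], dtype=object)

# Column names are hypothetical; any names work as long as the key matches.
df_a = pd.DataFrame(a, columns=['key', 'a_val'])
df_b = pd.DataFrame(b, columns=['key', 'b_val'])

# how='outer' keeps keys that appear in either frame.
merged = df_a.merge(df_b, on='key', how='outer').sort_values('key')

# pandas marks missing entries as NaN; swap them for None
# so the result matches the array asked for in the question.
c = np.array(
    [[None if pd.isna(v) else v for v in row]
     for row in merged.itertuples(index=False)],
    dtype=object,
)
print(c)
```

If NaN is acceptable as the missing-value marker, the last step can be dropped and merged.to_numpy() used directly.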

Upvotes: 2

Sven Marnach

Reputation: 601669

A pure Python approach would be

da = dict(a)
db = dict(b)
# sorted() gives a deterministic row order; set union collects keys from both arrays
c = np.array([(k, da.get(k), db.get(k))
              for k in sorted(set(da) | set(db))], dtype=object)

But if you are using NumPy, your arrays are probably big, and you are looking for a solution with better performance. In that case, I suggest using a real database, for example the sqlite3 module that comes with Python.
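A rough sketch of the sqlite3 route, assuming an in-memory database is good enough. The table names ta and tb and the column names k and v are made up for this example. The outer join is emulated with two LEFT JOINs and a UNION, since older SQLite versions do not support FULL OUTER JOIN.

```python
import sqlite3
import numpy as np

a = np.array([['a', 1], ['b', 2]], dtype=object)
b = np.array([['b', 3], ['c', 4]], dtype=object)

con = sqlite3.connect(':memory:')
# Table and column names are hypothetical, chosen for this example only.
con.execute('CREATE TABLE ta (k TEXT PRIMARY KEY, v INTEGER)')
con.execute('CREATE TABLE tb (k TEXT PRIMARY KEY, v INTEGER)')
con.executemany('INSERT INTO ta VALUES (?, ?)', a.tolist())
con.executemany('INSERT INTO tb VALUES (?, ?)', b.tolist())

# Left join in both directions; UNION removes the duplicated matching rows.
rows = con.execute('''
    SELECT ta.k, ta.v, tb.v FROM ta LEFT JOIN tb ON ta.k = tb.k
    UNION
    SELECT tb.k, ta.v, tb.v FROM tb LEFT JOIN ta ON ta.k = tb.k
    ORDER BY 1
''').fetchall()
con.close()

# SQL NULL comes back as None, which matches the desired output.
c = np.array(rows, dtype=object)
print(c)
```

Whether this actually beats the dictionary version depends on the data size, so it is worth timing both on real data.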

Upvotes: 7
