Reputation: 121
Suppose I have two arrays (after import numpy as np),
a=np.array([['a',1],['b',2]],dtype=object)
and
b=np.array([['b',3],['c',4]],dtype=object)
How do I get:
c=np.array([['a',1,None],['b',2,3],['c',None,4]],dtype=object)
Basically, a join using the first column as the key.
Thanks
Upvotes: 5
Views: 5534
Reputation: 121
The best solution I found is using pandas, which handles joins very well, and pandas objects convert to/from numpy arrays easily.
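For reference, here is a minimal sketch of how that could look with the arrays from the question. The column names "key", "x" and "y" are made up for illustration, and the final step turns pandas' NaN placeholders back into None so the result matches the requested array.

```python
import numpy as np
import pandas as pd

a = np.array([['a', 1], ['b', 2]], dtype=object)
b = np.array([['b', 3], ['c', 4]], dtype=object)

# Column names are hypothetical; any names work as long as the key column matches.
df_a = pd.DataFrame(a, columns=["key", "x"])
df_b = pd.DataFrame(b, columns=["key", "y"])

# how="outer" keeps keys that appear in either frame (a full outer join).
merged = df_a.merge(df_b, on="key", how="outer").sort_values("key")

# pandas marks missing entries with NaN; swap those for None on the way back to NumPy.
c = np.array(
    [[None if pd.isna(v) else v for v in row]
     for row in merged.to_numpy(dtype=object)],
    dtype=object,
)
```

The explicit sort is only there to make the row order predictable.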
Upvotes: 2
Reputation: 601669
A pure Python approach to do this would be
da = dict(a)
db = dict(b)
# Take the union of keys from both arrays; dict.get returns None for a missing key.
c = np.array([(k, da.get(k), db.get(k))
              for k in sorted(set(da) | set(db))], dtype=object)
But if you are using NumPy, your arrays are probably big, and you are looking for a solution with better performance. In that case, I suggest using a real database for this, for example the sqlite3 module that comes with Python.
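A rough sketch of the sqlite3 route, assuming an in-memory database and the arrays from the question. The table names ta and tb and the column names key and val are invented for this example. The full outer join is emulated with two LEFT JOINs and a UNION ALL, since older SQLite versions do not support FULL OUTER JOIN directly.

```python
import sqlite3
import numpy as np

a = np.array([['a', 1], ['b', 2]], dtype=object)
b = np.array([['b', 3], ['c', 4]], dtype=object)

# Table and column names (ta, tb, key, val) are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ta (key TEXT PRIMARY KEY, val)")
conn.execute("CREATE TABLE tb (key TEXT PRIMARY KEY, val)")
conn.executemany("INSERT INTO ta VALUES (?, ?)", a.tolist())
conn.executemany("INSERT INTO tb VALUES (?, ?)", b.tolist())

# First SELECT: every row of ta, with the matching tb value if there is one.
# Second SELECT: rows of tb whose key has no match in ta.
query = """
SELECT ta.key, ta.val, tb.val FROM ta LEFT JOIN tb ON ta.key = tb.key
UNION ALL
SELECT tb.key, ta.val, tb.val FROM tb LEFT JOIN ta ON ta.key = tb.key
WHERE ta.key IS NULL
ORDER BY 1
"""
rows = conn.execute(query).fetchall()
conn.close()

# SQL NULL comes back as None, so the rows convert directly to an object array.
c = np.array(rows, dtype=object)
```

For data that does not fit in memory, a file path can be passed to sqlite3.connect instead of ":memory:".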
Upvotes: 7