Reputation: 560
I have two dataframes:
df = pd.DataFrame([['A', 'B'], ['B', 'A']], columns=['Mon', 'Tues'])
Mon Tues
0 A B
1 B A
lookup = pd.DataFrame([[0, 1], [2, 3]], index=['A', 'B'], columns=df.columns)
Mon Tues
A 0 1
B 2 3
For each day, for each key in the first dataframe, I would like to look up its value in the second dataframe. This is one way to do it:
output = pd.DataFrame()
for col in df.columns:
output[col] = df[col].map(lookup[col])
Mon Tues
0 0 3
1 2 1
Is there a way to get the same answer without the explicit loop?
Upvotes: 2
Views: 191
Reputation: 176938
You can use replace
to exchange the values:
>>> df.replace(lookup)
Mon Tues
0 0 3
1 2 1
When passed to the replace
method, DataFrames are treated similarly to nested dictionaries. For each column label, the column's values are mapped to the correspondingly indexed value in the lookup DataFrame.
Upvotes: 2
Reputation: 221624
I am assuming that each column data in df
is a random order of the lookup.index
and that lookup.index
is in sorted order. So, going with it, here's a NumPy based vectorized solution, specifically using its advanced-indexing
-
idx = np.argsort(df.values,1)
df_out = pd.DataFrame(lookup.values[idx,np.arange(idx.shape[1])],columns=df.columns)
Sample output -
In [41]: idx = np.argsort(df.values,1)
In [42]: pd.DataFrame(lookup.values[idx,np.arange(idx.shape[1])],columns=df.columns)
Out[42]:
Mon Tues
0 0 3
1 2 1
If lookup.index
is not in sorted order, we need to do some additional work -
sidx = lookup.index.argsort()
arr_out = lookup.values[idx,np.arange(idx.shape[1])][sidx]
Upvotes: 2