Reputation: 495
I want to replace values in the df dataframe using the lookup dataframe.
import pandas as pd
df=pd.DataFrame({
'no1':[20,20,40,10,50],
'no2':[50,20,10,40,50],
'no3':[30,10,50,40,50]
})
no1 no2 no3
0 20 50 30
1 20 20 10
2 40 10 50
3 10 40 40
4 50 50 50
lookup=pd.DataFrame({'label':['A','B','C','D','E'],
'id':[10,20,30,40,50]})
label id
0 A 10
1 B 20
2 C 30
3 D 40
4 E 50
Particularly, I'd like to have:
no1 no2 no3
0 B E C
1 B B A
2 D A E
3 A D D
4 E E E
What is the best way doing it using pandas?
P.S.: I found a very similar question herein, but I do not quite follow as it is in R. A Python solution is appreciated.
Upvotes: 6
Views: 5878
Reputation: 164693
You can construct a dictionary and then use np.vectorize
:
d = lookup.set_index('id')['label'].to_dict() # or d = dict(np.fliplr(lookup.values))
df.iloc[:] = np.vectorize(d.get)(df.values)
print(df)
no1 no2 no3
0 B E C
1 B B A
2 D A E
3 A D D
4 E E E
Upvotes: 5
Reputation: 862771
First create Series
by set_index
.
Use replace
, but it should be slow in large DataFrame
:
s = lookup.set_index('id')['label']
df = df.replace(s)
Solutions for None
or NaN
s for non matched values with applymap
or apply
with map
:
df = df.applymap(s.get)
Or:
df = df.apply(lambda x: x.map(s))
Or:
for c in df.columns:
df[c] = df[c].map(s)
print (df)
no1 no2 no3
0 B E C
1 B B A
2 D A E
3 A D D
4 E E E
Upvotes: 1
Reputation: 61910
You could use replace with a dictionary:
import pandas as pd
df=pd.DataFrame({
'no1':[20,20,40,10,50],
'no2':[50,20,10,40,50],
'no3':[30,10,50,40,50]
})
lookup=pd.DataFrame({'label':['A','B','C','D','E'],
'id':[10,20,30,40,50]})
result = df.replace(dict(zip(lookup.id, lookup.label)))
print(result)
Output
no1 no2 no3
0 B E C
1 B B A
2 D A E
3 A D D
4 E E E
Upvotes: 8