Reputation: 1373
I have to replace values from one dataframe with values from another dataframe.
The example below works, but it takes extra steps: I have to replace the values in the "first" column with the values from the "new" column and then drop the "new" column.
In [1]: import pandas as pd
In [2]: df = pd.DataFrame([['A', 'X'],
...: ['B', 'X'],
...: ['C', 'X'],
...: ['A', 'Y'],
...: ['B', 'Y'],
...: ['C', 'Y'],
...: ], columns=['first', 'second'])
In [3]: df
Out[3]:
first second
0 A X
1 B X
2 C X
3 A Y
4 B Y
5 C Y
In [4]: df_tt = pd.DataFrame([['A', 'E'],
...: ['B', 'F'],
...: ], columns=['orig', 'new'])
In [5]: df_tt
Out[5]:
orig new
0 A E
1 B F
In [6]: df = df.merge(df_tt, left_on='first', right_on='orig')
In [7]: df
Out[7]:
first second orig new
0 A X A E
1 A Y A E
2 B X B F
3 B Y B F
In [8]: df['first'] = df['new']
In [9]: df
Out[9]:
first second orig new
0 E X A E
1 E Y A E
2 F X B F
3 F Y B F
In [10]: df.drop(columns=['orig', 'new'])
Out[10]:
first second
0 E X
1 E Y
2 F X
3 F Y
I would like to replace values with no extra steps.
Upvotes: 2
Reputation: 7476
Another solution is using replace:
# Restrict to common entries (.copy() avoids a SettingWithCopyWarning on the assignment below)
df = df[df['first'].isin(df_tt['orig'])].copy()
# Use df_tt as a mapping to replace values in df
df['first'] = df['first'].replace(df_tt.set_index('orig')['new'].to_dict())
This solution is very similar to @jezrael's, but I like the idea of explicitly using replace, because this is actually what you are doing: replacing values in one dataframe based on another dataframe.
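To illustrate why the filtering step matters, here is a minimal sketch using the question's data. It compares replace and map on the unfiltered column; the names mapping, replaced and mapped are only illustrative and do not appear in the original posts.

```python
import pandas as pd

# Data from the question
df = pd.DataFrame(
    [['A', 'X'], ['B', 'X'], ['C', 'X'], ['A', 'Y'], ['B', 'Y'], ['C', 'Y']],
    columns=['first', 'second'],
)
df_tt = pd.DataFrame([['A', 'E'], ['B', 'F']], columns=['orig', 'new'])

# Build a plain dict {'A': 'E', 'B': 'F'} from the translation table
mapping = df_tt.set_index('orig')['new'].to_dict()

# replace leaves values without a mapping untouched ('C' stays 'C')
replaced = df['first'].replace(mapping)

# map turns values without a mapping into NaN
mapped = df['first'].map(mapping)

print(replaced.tolist())
print(mapped.isna().tolist())
```

So without the isin filter, replace would keep the 'C' rows as they are, while map would leave NaN in them. Which behaviour is preferable depends on whether unmatched rows should survive.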
Upvotes: 3
Reputation: 862671
Use isin for filtering with boolean indexing and then map:
df = (df[df['first'].isin(df_tt['orig'])]
.assign(first=lambda x: x['first'].map(df_tt.set_index('orig')['new'])))
print(df)
first second
0 E X
1 F X
3 E Y
4 F Y
Alternative:
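# .copy() makes the filtered frame independent, so the assignment does not raise SettingWithCopyWarning
df = df[df['first'].isin(df_tt['orig'])].copy()
df['first'] = df['first'].map(df_tt.set_index('orig')['new'])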
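A possible variation, sketched here under the assumption that neither the "first" column nor the "new" column contains missing values: map first, then drop the rows that found no match. The names mapping and result are only illustrative.

```python
import pandas as pd

# Data from the question
df = pd.DataFrame(
    [['A', 'X'], ['B', 'X'], ['C', 'X'], ['A', 'Y'], ['B', 'Y'], ['C', 'Y']],
    columns=['first', 'second'],
)
df_tt = pd.DataFrame([['A', 'E'], ['B', 'F']], columns=['orig', 'new'])

# Series indexed by the original value, holding the new value
mapping = df_tt.set_index('orig')['new']

# Unmatched values become NaN after map, so dropna removes those rows.
# Assumption: there are no genuine NaN values that should be kept.
result = (
    df.assign(first=df['first'].map(mapping))
      .dropna(subset=['first'])
)
print(result)
```

This avoids the separate isin step, at the cost of also dropping any row whose mapped value is legitimately missing.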
Upvotes: 2