Reputation: 87
As it usually goes for pandas questions, it was difficult coming up with a concise enough title so people can know what I am asking about without it being too long, but hopefully it is sufficient. I will provide a simplifed example of what am trying to do, as my current working data set is extremely large.
Consider the following simple dataframe:
df = pd.DataFrame([[0, 5 ],
[1, 10],
[2, 20],
[0, 10],
[1, 0 ],
[2, 15]], columns = ["ID", "val"])
Which creates the following dataframe df
:
ID val
0 0 5
1 1 10
2 2 20
3 0 10
4 1 0
5 2 15
I would like to take this dataframe and append a new column which are the mapped values of a column of another dataframe I call df_map
, which looks for example like the following:
df_map = pd.DataFrame([[0,10],
[1,20],
[2,30]], columns = ["ID", "new_val"])
The final dataframe should thus look something like this:
ID val new_val
0 0 5 10
1 1 10 20
2 2 20 30
3 0 10 10
4 1 20 20
5 2 30 30
The mapping function should simply look at the value of the ID Column in df
and assign a new column based on the value which is given in df_map
Probably this can be done with a combination of the .groupby
and .map
functions in pandas, but I have yet to figure out how.
Any tips would be greatly appreciated.
Upvotes: 1
Views: 1870
Reputation: 61920
You could use Series.map:
df['new_val'] = df.ID.map(df_map.set_index('ID').squeeze())
Output
ID val new_val
0 0 5 10
1 1 10 20
2 2 20 30
3 0 10 10
4 1 0 20
5 2 15 30
Notice that squeeze in df_map.set_index('ID').squeeze()
converts the DataFrame into a Series.
Upvotes: 2