Mapping a new column in a pandas dataframe based on columns in another dataframe which share a column

Question

As it usually goes for pandas questions, it was difficult coming up with a concise enough title so people can know what I am asking about without it being too long, but hopefully it is sufficient. I will provide a simplifed example of what am trying to do, as my current working data set is extremely large.

Consider the following simple dataframe:

df = pd.DataFrame([[0, 5 ],
                   [1, 10],
                   [2, 20],
                   [0, 10],
                   [1, 0 ],
                   [2, 15]], columns = ["ID", "val"])

Which creates the following dataframe df:

I would like to take this dataframe and append a new column which are the mapped values of a column of another dataframe I call df_map, which looks for example like the following:

df_map = pd.DataFrame([[0,10],
                       [1,20],
                       [2,30]], columns = ["ID", "new_val"])

The final dataframe should thus look something like this:

   ID  val  new_val
0   0    5       10
1   1   10       20
2   2   20       30
3   0   10       10
4   1   20       20
5   2   30       30

The mapping function should simply look at the value of the ID Column in df and assign a new column based on the value which is given in df_map

Probably this can be done with a combination of the .groupby and .map functions in pandas, but I have yet to figure out how.

Any tips would be greatly appreciated.

Dani Mesejo · Accepted Answer

You could use Series.map:

df['new_val'] = df.ID.map(df_map.set_index('ID').squeeze())

Output

   ID  val  new_val
0   0    5       10
1   1   10       20
2   2   20       30
3   0   10       10
4   1    0       20
5   2   15       30

Notice that squeeze in df_map.set_index('ID').squeeze() converts the DataFrame into a Series.

Mapping a new column in a pandas dataframe based on columns in another dataframe which share a column

Answers (1)

Related Questions