Map values using two dataframe columns as references

Question

I have a dataframe, df, where I would like to map certain values to depending on criteria from two column values.

Data

id  date    type
aa  Q1.22   hello
aa  Q2.22   hi
aa  Q4.22   hey 
bb  Q3.23   sure
bb  Q4.23   ok

Desired

id  date    type    status
aa  Q1.22   hello   
aa  Q2.22   hi      2000
aa  Q4.22   hey     
bb  Q3.23   sure    
bb  Q4.23   ok      500

I would like to place a certain value, ex. 2000 whenever the 'id' is aa and the date is Q1.22 I would like to follow this logic.

Doing

df['status']=df['id', 'date'].map({'aa': '2000', 'bb': '500'})

I am using the map function, however, I only know how to map using one column. I am still researching/experimenting with this. Any suggestion is appreciated

SeaBean · Accepted Answer

One way to do it is by defining the mapping dictionary with keys as tuples holding the required row values of the 2 dataframe columns, as follows:

dict1 = {('aa', 'Q2.22'): '2000', ('bb', 'Q4.23'): '500'}

Then, to do the mapping:

df['status'] = df[['id', 'date']].apply(lambda x: dict1.get((x.id, x.date)), axis=1)

or simply:

df['status'] = df.apply(lambda x: dict1.get((x['id'], x['date'])), axis=1)

Result:

print(df)

   id   date   type status
0  aa  Q1.22  hello   None
1  aa  Q2.22     hi   2000
2  aa  Q4.22    hey   None
3  bb  Q3.23   sure   None
4  bb  Q4.23     ok    500

Optionally, if you want to change None to blank, you can use:

df['status'] = df['status'].fillna('')

Result:

print(df)

   id   date   type status
0  aa  Q1.22  hello       
1  aa  Q2.22     hi   2000
2  aa  Q4.22    hey       
3  bb  Q3.23   sure       
4  bb  Q4.23     ok    500

Edit (Alternate way to do the mapping):

There is alternate way of coding the mapping more resembling to the desired style of coding, as follows:

df['status'] = df[['id', 'date']].apply(tuple, axis=1).map(dict1)

or put the mapping dictionary dict1 inline, as follows:

df['status'] = df[['id', 'date']].apply(tuple, axis=1).map({('aa', 'Q2.22'): '2000', ('bb', 'Q4.23'): '500'})

The last one is the most resembling to the desired style of coding.

Map values using two dataframe columns as references

Answers (2)

Edit (Alternate way to do the mapping):

Related Questions