Lynn
Lynn

Reputation: 4408

Map values using two dataframe columns as references

I have a dataframe, df, where I would like to map certain values to depending on criteria from two column values.

Data

id  date    type
aa  Q1.22   hello
aa  Q2.22   hi
aa  Q4.22   hey 
bb  Q3.23   sure
bb  Q4.23   ok

Desired

id  date    type    status
aa  Q1.22   hello   
aa  Q2.22   hi      2000
aa  Q4.22   hey     
bb  Q3.23   sure    
bb  Q4.23   ok      500

I would like to place a certain value, ex. 2000 whenever the 'id' is aa and the date is Q1.22 I would like to follow this logic.

Doing

df['status']=df['id', 'date'].map({'aa': '2000', 'bb': '500'})

I am using the map function, however, I only know how to map using one column. I am still researching/experimenting with this. Any suggestion is appreciated

Upvotes: 1

Views: 1018

Answers (2)

SeaBean
SeaBean

Reputation: 23217

One way to do it is by defining the mapping dictionary with keys as tuples holding the required row values of the 2 dataframe columns, as follows:

dict1 = {('aa', 'Q2.22'): '2000', ('bb', 'Q4.23'): '500'}

Then, to do the mapping:

df['status'] = df[['id', 'date']].apply(lambda x: dict1.get((x.id, x.date)), axis=1)

or simply:

df['status'] = df.apply(lambda x: dict1.get((x['id'], x['date'])), axis=1)

Result:

print(df)

   id   date   type status
0  aa  Q1.22  hello   None
1  aa  Q2.22     hi   2000
2  aa  Q4.22    hey   None
3  bb  Q3.23   sure   None
4  bb  Q4.23     ok    500

Optionally, if you want to change None to blank, you can use:

df['status'] = df['status'].fillna('')

Result:

print(df)

   id   date   type status
0  aa  Q1.22  hello       
1  aa  Q2.22     hi   2000
2  aa  Q4.22    hey       
3  bb  Q3.23   sure       
4  bb  Q4.23     ok    500

Edit (Alternate way to do the mapping):

There is alternate way of coding the mapping more resembling to the desired style of coding, as follows:

df['status'] = df[['id', 'date']].apply(tuple, axis=1).map(dict1)

or put the mapping dictionary dict1 inline, as follows:

df['status'] = df[['id', 'date']].apply(tuple, axis=1).map({('aa', 'Q2.22'): '2000', ('bb', 'Q4.23'): '500'})

The last one is the most resembling to the desired style of coding.

Upvotes: 1

Henry Ecker
Henry Ecker

Reputation: 35676

Since the mapping structure is being manually assembled, I'd recommend a DataFrame.

mapping_df:

mapping_df = pd.DataFrame([
    {'id': 'aa', 'date': 'Q2.22', 'status': 2000},
    {'id': 'bb', 'date': 'Q4.23', 'status': 500}
])
   id   date  status
0  aa  Q2.22    2000
1  bb  Q4.23     500

Then merge

df = df.merge(mapping_df, how='left')

df:

   id   date   type  status
0  aa  Q1.22  hello     NaN
1  aa  Q2.22     hi  2000.0
2  aa  Q4.22    hey     NaN
3  bb  Q3.23   sure     NaN
4  bb  Q4.23     ok   500.0

Optional fillna

df = df.merge(mapping_df, how='left')
df['status'] = df['status'].fillna('')
   id   date   type  status
0  aa  Q1.22  hello        
1  aa  Q2.22     hi  2000.0
2  aa  Q4.22    hey        
3  bb  Q3.23   sure        
4  bb  Q4.23     ok   500.0

Upvotes: 1

Related Questions