Reputation: 1998
I have df below as:
id | name | status | country | ref_id
3 Bob False Germany NaN
5 422 True USA 3
7 Nick False India NaN
6 Chris True Australia 7
8 324 True Africa 28
28 Tim False Canada 53
I want to add a new column for each row, where if the status for that row is True, if the ref_id for that row exists in the id column in another row and that rows status is False, give me the value of the name in that column.
So expected output below would be:
id | name | status | country | ref_id | new
3 Bob False Germany NaN NaN
5 422 True USA 3 Bob
7 Nick False India NaN NaN
6 Chris True Australia 7 Nick
8 324 True Africa 28 Tim
28 Tim False Canada 53 NaN
I have code below that I am using for other purposes that just filters for rows that have a status of True, and and id_reference value that exists in the id column like below:
(df.loc[df["status"]&df["id_reference"].astype(float).isin(df.loc[~df["status"], "id"])])
But I am trying to also calculate a new column as mentioned prior above with the value of the name if it has one in that column
Thanks!
Upvotes: 1
Views: 38
Reputation: 150735
This essentially a merge:
merged = (df.loc[df['status'],['ref_id']]
.merge(df.loc[~df['status'],['id','name']], left_on='ref_id', right_on='id')
)
df['ref_id'] = (df['id'].map(merged.set_index('id')['name'])
.where(df['status'])
)
Upvotes: 1
Reputation: 323226
Let us try
df['new']=df.loc[df.status,'ref_id'].map(df.set_index('id')['name'])
df
id name status country ref_id new
0 3 Bob False Germany NaN NaN
1 5 422 True USA 3.0 Bob
2 7 Nick False India NaN NaN
3 6 Chris True Australia 7.0 Nick
4 8 324 True Africa 28.0 Tim
5 28 Tim False Canada 53.0 NaN
Upvotes: 2