Reputation: 143
Apologies if the terminology is incorrect.
My Dataframe has 4 columns. Col1 is the personId (will hold multiples of the same value) Col2 is the event type Col3 is Boolean Col4 is Boolean
Only col3 OR col4 can be true at any one time.
Input:
personId EventType Col3 Col4
1 A True False
2 A True False
3 A True False
2 B False True
4 A True False
3 B False True
Desired output:
personId EventType Col3 Col4 Col5
1 A True False False
2 A True False True
3 A True False True
2 B False True False
4 A True False False
3 B False True False
4 C False True False
As you can see, Col5 will be true where the Trues from Col3 relate to a personId which has also had EventType B, but not C.
I am utterly stuck. I am sure I am overlooking something extremely simple.
I have tried the following:
df = pd.merge(df, left_on='personId', right_on='personId')
df = pd.merge(df, df[['personId','Col4']], on='personId', how='left)
but neither achieve the desired results.
Upvotes: 2
Views: 882
Reputation: 153500
Better logic:
df['Col5'] = df.groupby('personId')['EventType'].transform(lambda x: (x=='B').any()) & df['Col3']
Output:
personId EventType Col3 Col4 Col5
0 1 A True False False
1 2 A True False True
2 3 A True False True
3 2 B False True False
4 4 A True False False
5 3 B False True False
Details: Use groupby personid and transform
EventType column to True if any Event in that personid equals to 'B', then use boolean logic operator and, &
with 'col3'.
IIUC, you try using groupby and some boolean logic:
df.join(df.groupby('personId', group_keys=False)
.apply(lambda x: (x['EventType']=='B').any() & x['Col3'])
.rename('Col5'))
Output:
personId EventType Col3 Col4 Col5
0 1 A True False False
1 2 A True False True
2 3 A True False True
3 2 B False True False
4 4 A True False False
5 3 B False True False
Upvotes: 3
Reputation: 2750
You don't need to join, you need to set column 5 to be the XOR of column3 and column4
df['Col5'] = df['Col4'] ^ df['Col3']
Upvotes: 1