Create new column in a dataframe based on group by and other conditions

Question

I have a data frame in which each id appears in either one or two rows.

I need to create two new columns, didx and puidx. If an id group has only one row, the values are determined by a condition on the 'type' column alone. If the group has two rows, the values depend on 'type' and are taken from the index of the other row in the same group.

An example makes this clearer.

We start with this data frame:

index	id	type
0	1	d
1	2	pu
2	3	pu
3	3	d
4	4	pu
5	4	d

The expected output follows these rules, with the resulting data frame shown after them:

if 1 row per 'id':
   if type == 'd':
      didx = -1
      puidx = -10
   elif type == 'pu':
      didx = -10
      puidx = -1
elif 2 rows per 'id':
   if type == 'd':
      didx = -1
      puidx = index of the other row with same 'id'
   elif type == 'pu':
      didx = index of the other row with same 'id'
      puidx = -1

index	id	type	didx	puidx
0	1	d	-1	-10
1	2	pu	-10	-1
2	3	pu	3	-1
3	3	d	-1	2
4	4	pu	5	-1
5	4	d	-1	4
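
To make the rules concrete, here is a minimal sketch that applies them literally, one group at a time. It assumes that 'type' only ever holds 'd' or 'pu' and that no id has more than two rows; the function name add_idx_columns is hypothetical. It is meant as a readable baseline, not as the efficient version.

```python
import pandas as pd

df = pd.DataFrame(
    {"id": [1, 2, 3, 3, 4, 4], "type": ["d", "pu", "pu", "d", "pu", "d"]}
)


def add_idx_columns(df):
    """Literal, loop-based reading of the rules (hypothetical helper)."""
    out = df.copy()
    # Both columns default to -1; only one of them is overwritten per row.
    out["didx"] = -1
    out["puidx"] = -1
    # .groups maps each id to the index labels of its rows.
    for labels in df.groupby("id").groups.values():
        labels = list(labels)
        for pos, label in enumerate(labels):
            # Label of the other row in a 2-row group, else the sentinel -10.
            other = labels[1 - pos] if len(labels) == 2 else -10
            if out.at[label, "type"] == "d":
                out.at[label, "puidx"] = other
            else:  # assumed to be 'pu'
                out.at[label, "didx"] = other
    return out


result = add_idx_columns(df)
```

On the sample data, result has the same didx and puidx values as the expected table above.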

I assume this needs groupby and apply, but I am not sure how to make it efficient. Thanks for your help.
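
One possible way to avoid apply altogether is sketched below. It relies on three assumptions that go beyond what the question states: the index is numeric, 'type' only holds 'd' or 'pu', and no id has more than two rows. Under those assumptions, the other row's index in a two-row group equals the group's index sum minus the row's own index. The variable names (idx, size, other, is_d) are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"id": [1, 2, 3, 3, 4, 4], "type": ["d", "pu", "pu", "d", "pu", "d"]}
)

# The index labels as a Series, so they can be grouped by id.
idx = pd.Series(df.index, index=df.index)
grp = idx.groupby(df["id"])

# Number of rows in each row's group, broadcast back to the rows.
size = grp.transform("size")

# In a 2-row group: (sum of both labels) - (own label) = the other label.
# In a 1-row group: fall back to the sentinel -10.
other = (grp.transform("sum") - idx).where(size == 2, -10)

# 'd' rows get didx = -1 and puidx = other; 'pu' rows get the reverse.
is_d = df["type"].eq("d")
df["didx"] = np.where(is_d, -1, other)
df["puidx"] = np.where(is_d, other, -1)
```

This uses only grouped transforms and np.where, so no Python function is called per group. If the index is not numeric, or groups can have more than two rows, the sum trick no longer applies and a different approach would be needed.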
