Rei
Rei

Reputation: 329

Subtract values of duplicate rows pandas

I have the following dataframe :

Date         Code     Name             Days
23/09/2019   WO       Smith Thomas     1
23/09/2019   RE       Johnson Robert   1
24/09/2019   WO       Smith Thomas     1
24/09/2019   NW       Smith Thomas     0.5
25/09/2019   OP       Johnson Robert   0.5
25/09/2019   GF       Johnson Robert   0.5

In case of duplicates based on the columns Date and Name, I would like to check if the sum of days is > 1. If yes then I would like to substrate the biggest value with the other but don't delete any rows, otherwise do nothing.

The desired output would be :

Date         Code     Name             Days
23/09/2019   WO       Smith Thomas     1
23/09/2019   RE       Johnson Robert   1
24/09/2019   WO       Smith Thomas     0.5
24/09/2019   NW       Smith Thomas     0.5
25/09/2019   OP       Johnson Robert   0.5
25/09/2019   GF       Johnson Robert   0.5

Is there any Pandas way to do it?

I could loop using Python but it doesn't seems efficient

Thanks

Upvotes: 0

Views: 726

Answers (1)

user7440787
user7440787

Reputation: 841

without more details on what the numerical operation you want to perform, the only way to get the result you want is the following:

df.Days = df.groupby(['Date', 'Name']).Days.transform(
    lambda x: x if len(x) == 1 else 
    [max(i - min(x), min(x)) if sum(x) > 1 else i for i in x])

The reasoning behind is:

  1. group all the rows on a Date, Name basis
  2. look at the Days column and checks if there are duplicates
  3. If that's the case it will then subtract all values by the lowest and keep take the maximum between the subtraction result of the lowest value in the list (this is a cheap way to prevent subtracting the lowest to itself)
    Date    Code    Name    Days
0   23/09/2019  WO  Smith Thomas    1.0
1   23/09/2019  RE  Johnson Robert  1.0
2   24/09/2019  WO  Smith Thomas    0.5
3   24/09/2019  NW  Smith Thomas    0.5
4   25/09/2019  OP  Johnson Robert  0.5
5   25/09/2019  GF  Johnson Robert  0.5

Upvotes: 1

Related Questions