Pandas check if the sum of two columns within the same group equals

Question

I have a purchase record table showing the purchasing history recorded by customers and the store. I would like to get a new column where it shows the sum of the customer records & store records from each customer, as a group, equals. So basically 5 + 3 = 5 + 3 so Yes for 001 but 2 != 3 so No for 002.

Date	custID	itmID	cust rec	store rec
0514	001	A	5	5
0514	001	B	3	3
0514	002	A	2	3

expected result:

Date	custID	itmID	cust rec	store rec	sum match
0514	001	A	5	5	Yes
0514	001	B	3	3	Yes
0514	002	A	2	3	No

my code:

df['sum match'] = (sum(df['store rec'])==sum(df['cust rec'])).groupby(match.custID)

error shows 'bool' object has no attribute 'groupby'

anky · Accepted Answer

You can groupby and transform sum and then check difference of the columns on axis=1. then get the last result to check if it is 0 and conditionally assign values to the new column:

c = (df.groupby("custID")[['cust rec','store rec']]
      .transform('sum').diff(axis=1).iloc[:,-1].eq(0))

df['sum match'] = np.where(c,"Yes",'No')

output:

print(df)

   Date  custID itmID  cust rec  store rec sum match
0   514       1     A         5          5       Yes
1   514       1     B         3          3       Yes
2   514       2     A         2          3        No

More details of the steps:

Groupby + transform sum returns sum of each group per column:

print(df.groupby("custID")[['cust rec','store rec']].transform('sum'))

   cust rec  store rec
0         8          8
1         8          8
2         2          3

then difference on axis=1:

print(df.groupby("custID")[['cust rec','store rec']].transform('sum').diff(axis=1))
   cust rec  store rec
0       NaN        0.0
1       NaN        0.0
2       NaN        1.0

Pandas check if the sum of two columns within the same group equals

Answers (1)

Related Questions