송준석

Reputation: 1031

How can I combine DataFrame rows by ID, joining one column and summing another, in Python?

I have the following data frame.

ID    Product  quantity
9626     a       1
9626     b       1
9626     c       1
6600     f       1
6600     a       1
6600     d       1

I want to combine the rows that share an ID. The desired result is shown below. (The quantity column is optional.)

ID    Product  quantity
9626     a,b,c     3
6600     a,d,f     3

I tried using merge and sum, but that did not work.

Can this only be solved with a loop?

I'd appreciate it if you could provide me with a solution.

Upvotes: 0

Views: 63

Answers (1)

Space Impact

Reputation: 13255

No loop is needed. Use groupby.agg to join the Product values and sum quantity for each ID:

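# Sort by Product so each joined string comes out alphabetical, then group by ID
# (sort=False keeps the IDs in the order they first appear in the sorted frame).
df = (df.sort_values('Product')
        .groupby('ID', as_index=False, sort=False)
        .agg({'Product':','.join, 'quantity':'sum'}))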

print(df)
     ID Product  quantity
0  9626   a,b,c         3
1  6600   a,d,f         3
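
For anyone who wants to run this end to end, here is a self-contained sketch that rebuilds the sample frame from the question. It is a small variant of the snippet above, not the same code: it sorts inside each group (via a lambda) instead of sorting the whole frame first, so the ID order follows the original data. The names result and products_only are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": [9626, 9626, 9626, 6600, 6600, 6600],
    "Product": ["a", "b", "c", "f", "a", "d"],
    "quantity": [1, 1, 1, 1, 1, 1],
})

# Variant: sort the products inside each group rather than sorting the
# whole frame, so IDs keep the order in which they first appear in df.
result = (
    df.groupby("ID", as_index=False, sort=False)
      .agg({"Product": lambda s: ",".join(sorted(s)), "quantity": "sum"})
)
print(result)

# If the quantity column is not needed, aggregate only Product.
products_only = (
    df.groupby("ID", sort=False)["Product"]
      .agg(lambda s: ",".join(sorted(s)))
      .reset_index()
)
print(products_only)
```

Both versions avoid an explicit Python loop; the grouping and aggregation are handled by pandas.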

Upvotes: 5
