Babyburger
Babyburger

Reputation: 1830

Joining rows based on value conditions

Assume a Pandas DataFrame like the following:

Year         Bread     Amount
-----------------------------
2008S1       white        50
2008S1       brown        30
2008S2       white        60
2008S2       brown        90
2009S1       brown       100
2009S1      golden      2000
2009S2       brown        20
2009S2      golden      1000

I wish to join the relevant year/bread combo together such that the amount is added together. It's okay to assume that every year+bread combo is unique. So the result would become:

Year         Bread     Amount
-----------------------------
2008         white       110
2008         brown       120
2009         brown       120
2009        golden      3000

What's a good way to do this? I thought of using boolean indexing to splice the rows containing yearS2 and bread (deleting the rows and extracting the values). Then look for the relevant rows (yearS1 and same bread) again through boolean indexing, to perform the addition. This sounds like a lot of work which I assume can be handled more elegantly.

Upvotes: 3

Views: 57

Answers (2)

Binyamin Even
Binyamin Even

Reputation: 3382

assuming your DataFrame is in df

df.groupby(['year','bread'])['amount'].sum().reset_index()

Upvotes: -1

user2285236
user2285236

Reputation:

You can use groupby. Grouping will be done on the first 4 characters of the Year column and the Bread column as follows:

df.groupby([df['Year'].str[:4], 'Bread']).sum()
Out: 
             Amount
Year Bread         
2008 brown      120
     white      110
2009 brown      120
     golden    3000

Or as regular columns:

df.groupby([df['Year'].str[:4], 'Bread'], as_index=False).sum()
Out: 
    Bread  Amount
0   brown     120
1   white     110
2   brown     120
3  golden    3000

Upvotes: 4

Related Questions