Kalana
Kalana

Reputation: 6143

How to Subtract rows after group by in Python?

I had a dataframe and after applying groupby().sum, I got this outcome.

What I have

Market     Type      Amount                                   
A          Buy        46.56
B          Buy       100.00
           Sell      100.00
C          Buy       697.20
D          Buy       100.00 <--
           Sell       40.00 <--
...        ...        ...

What I want now

Market     Type      Amount                                   
A          Buy        46.56
C          Buy       697.20
D          Buy        60.00 <--
...        ...         ...

Things to consider

How can I achieve this result?

Upvotes: 2

Views: 124

Answers (3)

ugurtosun
ugurtosun

Reputation: 327

I guess it is not optimized way, but you can try this code block;

idx = df[df["Type"] == 'Sell'].index
df.loc[idx, 'Amount'] = df.loc[idx, 'Amount'] * (-1) 

df = df.groupby(['Market'], as_index=False).agg({'Type': 'min', 'Amount': 'sum'})
df = df.loc[~((df['Amount'] == 0))] 

Upvotes: 1

Luis Blanche
Luis Blanche

Reputation: 627

Here is another way to to it that I think is more "pythonic" : use a pandas.DataFrame.pivot

First reset your index, then apply pivot on 'Type' column, and you get your Buy/Sell amounts in columns, then you can just substract them :

df = df.pivot(index='Market', columns='Type')
        Amount       
Type       Buy   Sell
Market               
A        45.65    NaN
B       100.00  100.0
C       697.20    NaN
D       100.00   40.0

df = pd.DataFrame(df['Amount']['Buy'] - df['Amount']['Sell'].fillna(0), columns=['Buy'])

(Use fillna(0) to make sure you substract 0 when NA)

You can then filter out 0 values

df = df[df['Buy'] > 0 ]

which will get you :

           Buy
Market        
A        45.65
C       697.20
D        60.00

Upvotes: 1

David Meu
David Meu

Reputation: 1545

You can iterate over the df rows:

import pandas as pd
new_data = []
data = [['A', 'Buy', 46.56], ['B', 'Buy', 100.00], ['','Sell', 100],
        ['C', 'Buy', 697.20], ['D', 'Buy', 100.00], ['','Sell', 40]]
df = pd.DataFrame(data, columns=['Market','Type','Amount'])
for index, row in df.iterrows():
    if row['Type'] == 'Buy' and df.iloc[index+1]['Type'] == 'Buy':
        new_data.append([row['Market'], row['Type'],row['Amount']])
    elif row['Type'] == 'Buy' and df.iloc[index + 1]['Type'] == 'Sell':
        if df.iloc[index]['Amount'] - df.iloc[index+1]['Amount']:
            new_data.append([row['Market'], row['Type'],
                             df.iloc[index]['Amount'] - df.iloc[index+1]['Amount']])

print(pd.DataFrame(new_data, columns = ['Market','Type','Amount']))

Outputs:

  Market Type  Amount
0      A  Buy   46.56
1      C  Buy  697.20
2      D  Buy   60.00

Upvotes: 1

Related Questions