Reputation: 6143
I had a dataframe
and after applying groupby().sum
, I got this outcome.
What I have
Market Type Amount
A Buy 46.56
B Buy 100.00
Sell 100.00
C Buy 697.20
D Buy 100.00 <--
Sell 40.00 <--
... ... ...
What I want now
Market Type Amount
A Buy 46.56
C Buy 697.20
D Buy 60.00 <--
... ... ...
Things to consider
B
should remove from the dataframe
because 100.00 - 100.00 = 0
Buy Amount > Sell Amount
How can I achieve this result?
Upvotes: 2
Views: 124
Reputation: 327
I guess it is not optimized way, but you can try this code block;
idx = df[df["Type"] == 'Sell'].index
df.loc[idx, 'Amount'] = df.loc[idx, 'Amount'] * (-1)
df = df.groupby(['Market'], as_index=False).agg({'Type': 'min', 'Amount': 'sum'})
df = df.loc[~((df['Amount'] == 0))]
Upvotes: 1
Reputation: 627
Here is another way to to it that I think is more "pythonic" : use a pandas.DataFrame.pivot
First reset your index, then apply pivot on 'Type' column, and you get your Buy/Sell amounts in columns, then you can just substract them :
df = df.pivot(index='Market', columns='Type')
Amount
Type Buy Sell
Market
A 45.65 NaN
B 100.00 100.0
C 697.20 NaN
D 100.00 40.0
df = pd.DataFrame(df['Amount']['Buy'] - df['Amount']['Sell'].fillna(0), columns=['Buy'])
(Use fillna(0)
to make sure you substract 0 when NA)
You can then filter out 0 values
df = df[df['Buy'] > 0 ]
which will get you :
Buy
Market
A 45.65
C 697.20
D 60.00
Upvotes: 1
Reputation: 1545
You can iterate over the df rows:
import pandas as pd
new_data = []
data = [['A', 'Buy', 46.56], ['B', 'Buy', 100.00], ['','Sell', 100],
['C', 'Buy', 697.20], ['D', 'Buy', 100.00], ['','Sell', 40]]
df = pd.DataFrame(data, columns=['Market','Type','Amount'])
for index, row in df.iterrows():
if row['Type'] == 'Buy' and df.iloc[index+1]['Type'] == 'Buy':
new_data.append([row['Market'], row['Type'],row['Amount']])
elif row['Type'] == 'Buy' and df.iloc[index + 1]['Type'] == 'Sell':
if df.iloc[index]['Amount'] - df.iloc[index+1]['Amount']:
new_data.append([row['Market'], row['Type'],
df.iloc[index]['Amount'] - df.iloc[index+1]['Amount']])
print(pd.DataFrame(new_data, columns = ['Market','Type','Amount']))
Outputs:
Market Type Amount
0 A Buy 46.56
1 C Buy 697.20
2 D Buy 60.00
Upvotes: 1