Reputation: 50
I have a dataframe like below
stock__stock__FINCODE quantity price transcation_type month
0 100112 1000.0 2000.0 buy 2022-06-01
1 100112 500.0 2000.0 buy 2022-06-01
2 100112 1200.0 3000.0 sell 2022-08-01
3 100112 100.0 3000.0 sell 2022-08-01
4 100112 200.0 3000.0 sell 2022-08-01
5 100325 100.0 1500.0 buy 2022-06-01
I did groupby with stock__stock__FINCODE
, transcation_type
and month
, and made sum of quantity
. The code and result is as below
cols = ['stock__stock__FINCODE', 'month', 'transcation_type']
df2 = df.groupby(cols)[['quantity']].sum().add_suffix('_sum')
quantity_sum
stock__stock__FINCODE month transcation_type
100112 2022-06-01 buy 1500.0
2022-08-01 sell 1500.0
100325 2022-06-01 buy 100.0
but I want the result on each month and if a sell happens in same month then need to substract with the buy. If any month is not available then add that month and make the quantity same as previous month.
the result should be some thing like below.
stock__stock__FINCODE 2022-06-01 2022-07-01 2022-08-01
100112 1500 1500 0
100325 100 100 100
the date column is basically monthly quantity. It should start from the smallest date to till now.
can someone please help me with this?
Upvotes: 0
Views: 61
Reputation: 1082
Here you go. You will need to create a daterange index
date_range_index = pd.date_range(start=df.month.min(),end=df.month.max(), freq='MS')
date_range_index_df = pd.DataFrame(index=date_range_index)
Unstack df2 to create 2 columns one for buy one for sell
cols = ['stock__stock__FINCODE', 'month', 'transcation_type']
df2 = df.groupby(cols)[['quantity']].sum().add_suffix('_sum')
df2_unstack = df2.unstack(level='transcation_type').reset_index()
df2_unstack.columns = df2_unstack.columns.map('_'.join)
df2_unstack = df2_unstack.set_index(pd.to_datetime(df2_unstack.month_)).fillna(0)
df2_unstack
Use pandas reindex
to create a function and get the cumulative
#df2.sort_index()
def apply_on_each_stock(gdf: pd.DataFrame):
gdf_up = gdf.reindex(date_range_index,fill_value=0).sort_index(level='month_')
gdf_up['cumbuy'] = gdf_up['quantity_sum_buy'].cumsum()
gdf_up['cumsell'] = gdf_up['quantity_sum_sell'].cumsum()
gdf_up['qty_rem'] = gdf_up['cumbuy'] - gdf_up['cumsell']
return(gdf_up)
df2_ops = df2_unstack.groupby(['stock__stock__FINCODE_']).apply(
lambda row: apply_on_each_stock(row)
)
And then you reshape as you want
df2_ops['qty_rem'].unstack(level=1)
2022-06-01 2022-07-01 2022-08-01
stock__stock__FINCODE_
100112 1500.0 1500.0 0.0
100325 100.0 100.0 100.0
Upvotes: 1