Reputation: 21274
I have a dataframe with fields ['Transaction Description', 'Transaction Date', 'Debit Amount']
. I want to sum the Debit Amount column by month and by Transaction Description to see how much I am spending on different things each month.
I tried
df.groupby(['Transaction Description', 'Transaction Date'])['Debit Amount'].sum().sort_values(ascending=False).head(180)
which gives me a sum for each Transcripton Description by day.
Alternatively,
df.groupby(['Transaction Description'])['Debit Amount'].sum().sort_values(ascending=False).head(180)
gives me what I want but for the whole dataframe, not split by month.
I would like the output to have the months in order and for each one show the total amount spent on each Transaction Description, sorted from largest to smallest. This is so I can look at a given month and see what I have been spending my money on.
Here is a sample from the dataframe
{'Transaction Date': {0: Timestamp('2022-05-04 00:00:00'),
1: Timestamp('2022-05-04 00:00:00'),
2: Timestamp('2022-04-04 00:00:00'),
3: Timestamp('2022-04-04 00:00:00'),
4: Timestamp('2022-04-04 00:00:00'),
5: Timestamp('2022-04-04 00:00:00'),
6: Timestamp('2022-04-04 00:00:00'),
7: Timestamp('2022-04-04 00:00:00'),
8: Timestamp('2022-04-04 00:00:00'),
9: Timestamp('2022-01-04 00:00:00')},
'Transaction Description': {0: 'School',
1: 'Cleaner',
2: 'Taxi',
3: 'shop',
4: 'MOBILE',
5: 'Restaurant',
6: 'Restaurant',
7: 'shop',
8: 'Taxi',
9: 'shop'},
'Debit Amount': {0: 15.0,
1: 26.0,
2: 48.48,
3: 9.18,
4: 7.0,
5: 10.05,
6: 9.1,
7: 2.14,
8: 16.0,
9: 11.68}
In this case I would like the output to be something like:
2022-01 shop 11.68
2022-04 Taxi 64.48
shop 23.00
MOBILE 7.00
Restaurant 19.15
2022-05 School 15.00
Cleaner 26.00
Upvotes: 0
Views: 521
Reputation: 11650
Try this out, if you had shared the MRE, i would have validated and shared the resultset
# create a ym column and use that in your groupby
(df.assign(ym=pd.to_datetime(df['Transaction Date']).dt.strftime('%Y-%m'))
.groupby(['ym','Transaction Description' ] )['Debit Amount'].sum()
)
ym Transaction Description
2022-01 shop 11.68
2022-04 MOBILE 7.00
Restaurant 19.15
Taxi 64.48
shop 11.32
2022-05 Cleaner 26.00
School 15.00
Name: Debit Amount, dtype: float64
OR
(df.assign(ym=pd.to_datetime(df['Transaction Date']).dt.strftime('%Y-%m'))
.groupby(['ym','Transaction Description' ], as_index=False )['Debit Amount'].sum()
)
ym Transaction Description Debit Amount
0 2022-01 shop 11.68
1 2022-04 MOBILE 7.00
2 2022-04 Restaurant 19.15
3 2022-04 Taxi 64.48
4 2022-04 shop 11.32
5 2022-05 Cleaner 26.00
6 2022-05 School 15.00
Upvotes: 1
Reputation: 93151
Use pd.Grouper
. I assume your Transaction Date
is of type date:
df.groupby([pd.Grouper(key="Transaction Date", freq="MS"), "Transaction Description"]).sum()
Upvotes: 2