Simd
Simd

Reputation: 21274

How to aggregate by month in pandas

I have a dataframe with fields ['Transaction Description', 'Transaction Date', 'Debit Amount']. I want to sum the Debit Amount column by month and by Transaction Description to see how much I am spending on different things each month.

I tried

df.groupby(['Transaction Description', 'Transaction Date'])['Debit Amount'].sum().sort_values(ascending=False).head(180)

which gives me a sum for each Transcripton Description by day.

Alternatively,

df.groupby(['Transaction Description'])['Debit Amount'].sum().sort_values(ascending=False).head(180)

gives me what I want but for the whole dataframe, not split by month.


I would like the output to have the months in order and for each one show the total amount spent on each Transaction Description, sorted from largest to smallest. This is so I can look at a given month and see what I have been spending my money on.

Here is a sample from the dataframe

{'Transaction Date': {0: Timestamp('2022-05-04 00:00:00'),
  1: Timestamp('2022-05-04 00:00:00'),
  2: Timestamp('2022-04-04 00:00:00'),
  3: Timestamp('2022-04-04 00:00:00'),
  4: Timestamp('2022-04-04 00:00:00'),
  5: Timestamp('2022-04-04 00:00:00'),
  6: Timestamp('2022-04-04 00:00:00'),
  7: Timestamp('2022-04-04 00:00:00'),
  8: Timestamp('2022-04-04 00:00:00'),
  9: Timestamp('2022-01-04 00:00:00')},
 'Transaction Description': {0: 'School',
  1: 'Cleaner',
  2: 'Taxi',
  3: 'shop',
  4: 'MOBILE',
  5: 'Restaurant',
  6: 'Restaurant',
  7: 'shop',
  8: 'Taxi',
  9: 'shop'},
 'Debit Amount': {0: 15.0,
  1: 26.0,
  2: 48.48,
  3: 9.18,
  4: 7.0,
  5: 10.05,
  6: 9.1,
  7: 2.14,
  8: 16.0,
  9: 11.68}

In this case I would like the output to be something like:

2022-01 shop 11.68
2022-04 Taxi    64.48
    shop    23.00
    MOBILE  7.00
    Restaurant  19.15
2022-05 School  15.00
    Cleaner 26.00

Upvotes: 0

Views: 521

Answers (2)

Naveed
Naveed

Reputation: 11650

Try this out, if you had shared the MRE, i would have validated and shared the resultset

# create a ym column and use that in your groupby
(df.assign(ym=pd.to_datetime(df['Transaction Date']).dt.strftime('%Y-%m'))
 .groupby(['ym','Transaction Description' ] )['Debit Amount'].sum() 
)
ym       Transaction Description
2022-01  shop                       11.68
2022-04  MOBILE                      7.00
         Restaurant                 19.15
         Taxi                       64.48
         shop                       11.32
2022-05  Cleaner                    26.00
         School                     15.00
Name: Debit Amount, dtype: float64

OR

(df.assign(ym=pd.to_datetime(df['Transaction Date']).dt.strftime('%Y-%m'))
 .groupby(['ym','Transaction Description' ], as_index=False )['Debit Amount'].sum()
)

        ym  Transaction Description     Debit Amount
0   2022-01                    shop            11.68
1   2022-04                    MOBILE           7.00
2   2022-04                    Restaurant      19.15
3   2022-04                    Taxi            64.48
4   2022-04                    shop            11.32
5   2022-05                    Cleaner         26.00
6   2022-05                    School          15.00

Upvotes: 1

Code Different
Code Different

Reputation: 93151

Use pd.Grouper. I assume your Transaction Date is of type date:

df.groupby([pd.Grouper(key="Transaction Date", freq="MS"), "Transaction Description"]).sum()

Upvotes: 2

Related Questions