SCool
SCool

Reputation: 3375

How to group transactions by year and month?

I've checked all the suggested answers here but each one I try I think I break my code more.

Sample of my data:

print(transactions.head())

                      loc  amount  local_date
0                    RAIL     8.1  2016-09-30
1      LINK NETWORK LIMIT     4.0  2016-10-02
2          CHOCOLATE CAFE     3.0  2016-10-03
3         Four Star Pizza     9.7  2016-10-03
4                  Cinema    10.0  2016-10-04

I just want to group by year, and month and sum transactions.

For example my expected outcome:

2019 Jan  100
     Feb  123 
     Mar  150
     etc.
2018 Jan  200
     Feb  150
     Mar  211
     etc.

And what I have tried (basically all the suggested answers)

transactions.set_index('local_date').groupby([(transactions.index.year),(transactions.index.month)])['amount'].sum()


AttributeError                            Traceback (most recent call last)
<ipython-input-332-64938cfdee85> in <module>
----> 1 transactions.set_index('local_date').groupby([(transactions.index.year),(transactions.index.month)])['amount'].sum()

AttributeError: 'RangeIndex' object has no attribute 'year'
transactions.set_index('local_date').groupby([(transactions.index.dt.year),(transactions.index.dt.month)])['amount'].sum()

AttributeError                            Traceback (most recent call last)
<ipython-input-334-150c05241676> in <module>
----> 1 transactions.set_index('local_date').groupby([(transactions.index.dt.year),(transactions.index.dt.month)])['amount'].sum()


AttributeError: 'RangeIndex' object has no attribute 'dt'
transactions.set_index('local_date').groupby([(transactions.index.to_series().dt.year),(transactions.index.to_series.()dt.month)])['amount'].sum()


AttributeError: Can only use .dt accessor with datetimelike values

I'm losing my mind. Where am I going wrong?

Upvotes: 0

Views: 409

Answers (2)

ansev
ansev

Reputation: 30930

Use Series.dt + DataFrame.groupby:

df['local_date']=pd.to_datetime(df['local_date'])
df.groupby([df['local_date'].dt.year,df['local_date'].dt.month])['amount'].sum()

local_date  local_date
2016        9              8.1
            10            26.7

if you prefere show the name of the month:

new_df=df.groupby([df['local_date'].dt.year,df['local_date'].dt.month_name()])['amount'].sum().to_frame('Total amount')
print(new_df)
                       Total amount
local_date local_date              
2016       October             26.7
           September            8.1

new_df=df.groupby(df['local_date'].dt.to_period('M')).amount.sum().to_frame('Total_amount')
print(new_df)

            Total_amount
local_date              
2016-09              8.1
2016-10             26.7

Upvotes: 3

Quang Hoang
Quang Hoang

Reputation: 150785

You can groupby year, month:

(transaction.groupby([transaction.local_date.dt.year,
                     transaction.local_date.dt.month])
          .sum())

Output:

                       amount
local_date local_date        
2016       9              8.1
           10            26.7

If you want the month name, replace .dt.month with .dt.month_name(), but you will need to do extra for the correct ordering:

                       amount
local_date local_date        
2016       October       26.7
           September      8.1

Upvotes: 2

Related Questions