Math

Reputation: 1294

Pandas groupby get month and year values

I have the following DataFrame (50,000 rows covering 2016, 2017 and 2018):

data.head()
             Values
Date              
2016-07-20    10.0
2016-07-20    10.0
2016-07-22     1.0
2016-07-22     1.0
2016-07-22     1.0

A simple groupby:

data.groupby([(data.index.year==2018),(data.index.month==5)])['Values'].sum()

False  False   1461787.35
       True      80714.53
True   False    862589.66
       True     172600.04
Name: Values, dtype: float64

How can I replace the False and True labels for year and month with 2018 and 5?

Expected output:

    2018     5   80714.53
    Name: Values, dtype: float64

Upvotes: 2

Views: 113

Answers (1)

jezrael

Reputation: 862481

You need to group by year and month only:

# changed data for a better sample
print (data)

            Values
Date              
2015-07-20    10.0
2015-05-20    20.0
2016-05-20     2.0
2018-05-22     1.0
2018-05-22     3.0
2018-07-22     4.0

df1 = data.groupby([data.index.year, data.index.month])['Values'].sum()
print (df1)
Date  Date
2015  5       20.0
      7       10.0
2016  5        2.0
2018  5        4.0
      7        4.0
Name: Values, dtype: float64

Your solution groups by boolean masks: for the year, True collects only the 2018 rows and False collects all non-2018 rows, and the month mask works the same way for May:

df2 = data.groupby([(data.index.year==2018),(data.index.month==5)])['Values'].sum()
print (df2)
False  False    10.0
       True     22.0
True   False     4.0
       True      4.0
Name: Values, dtype: float64
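To make the mask behaviour concrete, here is a minimal self-contained sketch that rebuilds the sample data above and checks that the (True, True) bucket is just the sum of the rows passing both conditions. The variable names `is_2018`, `is_may`, `by_mask` and `filtered_total` are my own and only illustrative.

```python
import pandas as pd

# Sample data copied from the answer above
data = pd.DataFrame(
    {"Values": [10.0, 20.0, 2.0, 1.0, 3.0, 4.0]},
    index=pd.DatetimeIndex(
        ["2015-07-20", "2015-05-20", "2016-05-20",
         "2018-05-22", "2018-05-22", "2018-07-22"],
        name="Date",
    ),
)

# Each comparison yields one boolean per row, so the group keys
# can only ever be True or False, never the year or month itself
is_2018 = data.index.year == 2018
is_may = data.index.month == 5

# One bucket per (is_2018, is_may) combination, as a plain dict
by_mask = data.groupby([is_2018, is_may])["Values"].sum().to_dict()

# The same total obtained by filtering rows instead of grouping
filtered_total = data.loc[is_2018 & is_may, "Values"].sum()

print(by_mask[(True, True)], filtered_total)
```

Both numbers agree, which is why the labels come out as True/False: the grouping keys are the comparison results, not the calendar values.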

If you want only the filtered values, you can filter first and then group:

df = data[(data.index.year==2018) & (data.index.month==5)]
print (df)
            Values
Date              
2018-05-22     1.0
2018-05-22     3.0

out = df.groupby([df.index.year, df.index.month])['Values'].sum()
print (out)
Date  Date
2018  5       4.0
Name: Values, dtype: float64
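As a possible alternative to filtering first, you could group all rows by year and month once and then pick the pair you need from the result. The sketch below assumes the same sample data; the level names "year" and "month" and the variables `totals` and `may_2018` are illustrative choices, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the answer above
data = pd.DataFrame(
    {"Values": [10.0, 20.0, 2.0, 1.0, 3.0, 4.0]},
    index=pd.DatetimeIndex(
        ["2015-07-20", "2015-05-20", "2016-05-20",
         "2018-05-22", "2018-05-22", "2018-07-22"],
        name="Date",
    ),
)

# Group by every (year, month) pair; renaming the keys gives the
# result index levels distinct names instead of "Date  Date"
totals = data.groupby(
    [data.index.year.rename("year"), data.index.month.rename("month")]
)["Values"].sum()

# Select a single (year, month) pair from the MultiIndex result
may_2018 = totals.loc[(2018, 5)]
print(may_2018)
```

This computes sums for every month, so on a large frame the filter-then-group version above may be the cheaper choice when only one month is needed.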

Upvotes: 2
