Reputation: 1294
I have the following DataFrame (50,000 rows covering 2016, 2017 and 2018):
data.head()
Values
Date
2016-07-20 10.0
2016-07-20 10.0
2016-07-22 1.0
2016-07-22 1.0
2016-07-22 1.0
Using a simple groupby:
data.groupby([(data.index.year==2018),(data.index.month==5)])['Values'].sum()
False  False    1461787.35
       True       80714.53
True   False     862589.66
       True      172600.04
Name: Values, dtype: float64
How can I change False and True in the year and month levels to 2018 and 5?
Expected output:
2018 5 80714.53
Name: Values, dtype: float64
Upvotes: 2
Views: 113
Reputation: 862481
You need to group by the years and months themselves, not by comparisons against them:
#changed data for better sample
print (data)
Values
Date
2015-07-20 10.0
2015-05-20 20.0
2016-05-20 2.0
2018-05-22 1.0
2018-05-22 3.0
2018-07-22 4.0
df1 = data.groupby([data.index.year, data.index.month])['Values'].sum()
print (df1)
Date  Date
2015  5       20.0
      7       10.0
2016  5        2.0
2018  5        4.0
      7        4.0
Name: Values, dtype: float64
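As a self-contained sketch of the same idea, the sample data can be rebuilt and the single (2018, 5) row picked out of the grouped result afterwards. The variable names totals and may_2018 and the level names year and month are my own choices for illustration, not something from the question:

```python
import pandas as pd

# Rebuild the sample data used in this answer
data = pd.DataFrame(
    {"Values": [10.0, 20.0, 2.0, 1.0, 3.0, 4.0]},
    index=pd.DatetimeIndex(
        ["2015-07-20", "2015-05-20", "2016-05-20",
         "2018-05-22", "2018-05-22", "2018-07-22"],
        name="Date",
    ),
)

# Group by the actual year and month values, not by boolean comparisons
totals = data.groupby([data.index.year, data.index.month])["Values"].sum()

# Both levels are called "Date" by default; rename them (assumed names)
totals.index.names = ["year", "month"]

# A list with one tuple keeps the result a Series with a MultiIndex
may_2018 = totals.loc[[(2018, 5)]]
print(may_2018)
```

This computes the sums for every year and month first, so it is mostly useful when the other groups are needed as well.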
Your solution groups by boolean masks: for the year, True holds only the 2018 rows and False holds all non-2018 rows, and it works the same way for the month:
df2 = data.groupby([(data.index.year==2018),(data.index.month==5)])['Values'].sum()
print (df2)
False  False    10.0
       True     22.0
True   False     4.0
       True      4.0
Name: Values, dtype: float64
If you want only the filtered values, you can filter the rows first and then group:
df = data[(data.index.year==2018) & (data.index.month==5)]
print (df)
Values
Date
2018-05-22 1.0
2018-05-22 3.0
out = df.groupby([df.index.year, df.index.month])['Values'].sum()
print (out)
Date  Date
2018  5       4.0
Name: Values, dtype: float64
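The filter-then-aggregate step can also be wrapped in a small helper. This is only a sketch: the function name sum_for_month, its parameters, and the level names year and month are hypothetical, and it builds the one-row result directly instead of calling groupby:

```python
import pandas as pd

def sum_for_month(frame, year, month, column="Values"):
    """Sum `column` over the rows of `frame` falling in the given year and month.

    Hypothetical helper; assumes `frame` has a DatetimeIndex.
    """
    # Boolean mask selecting only the requested year and month
    mask = (frame.index.year == year) & (frame.index.month == month)
    selected = frame.loc[mask, column]
    # Label the single result with the year and month instead of True/False
    index = pd.MultiIndex.from_tuples([(year, month)], names=["year", "month"])
    return pd.Series([selected.sum()], index=index, name=column)

# Sample data from this answer
data = pd.DataFrame(
    {"Values": [10.0, 20.0, 2.0, 1.0, 3.0, 4.0]},
    index=pd.DatetimeIndex(
        ["2015-07-20", "2015-05-20", "2016-05-20",
         "2018-05-22", "2018-05-22", "2018-07-22"],
        name="Date",
    ),
)

result = sum_for_month(data, 2018, 5)
print(result)
```

Note that if no rows match, this sketch returns a sum of 0.0 for that year and month, whereas filtering and then using groupby would return an empty Series.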
Upvotes: 2