Reputation: 524
I have a DataFrame containing hourly data. I want the max for each week of the year, so I used resample to group the data by week:
weeks = data.resample("W").max()
The problem is that the weekly max is calculated on weeks aligned to Mondays, while I want the weeks to be counted from the first day of the year.
I get the following result, where you can see that there are 53 weeks and the last one is labelled in the next year, even though the data contains nothing from 2017:
Date dots
2016-01-03 0.647786
2016-01-10 0.917071
2016-01-17 0.667857
2016-01-24 0.669286
2016-01-31 0.645357
Date dots
2016-12-04 0.646786
2016-12-11 0.857714
2016-12-18 0.670000
2016-12-25 0.674571
2017-01-01 0.654571
Is there a way to calculate weeks for a pandas DataFrame starting from the first day of the year?
Upvotes: 1
Views: 2926
Reputation: 187
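Find the weekday on which the year starts. For example, let's say it's Friday (as it is for 2016). You can then pass an anchoring suffix to resample so that the weeks are counted from the first day of the year. Note that the suffix names the day on which each week ends, so for weeks running Friday through Thursday the anchor is Thursday, and each row is labelled with the last day of its week:
weeks = data.resample("W-THU").max()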
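If you don't want to hard-code the weekday, the anchor can be derived from the data. The following is only a sketch: the hourly `data` frame is made up to stand in for the asker's, and the names `days` and `end_day` are illustrative. It relies on the weekly anchor being the day a week ends on, so it picks the day before the first timestamp's weekday, then shifts the labels back six days so each row is labelled with the start of its week.

```python
import numpy as np
import pandas as pd

# Made-up hourly data for 2016 (a leap year: 366 days), standing in for `data`.
idx = pd.date_range("2016-01-01", periods=366 * 24, freq=pd.Timedelta(hours=1))
data = pd.DataFrame({"dots": np.arange(len(idx), dtype=float)}, index=idx)

# "W-XXX" names the day each week ENDS on, so take the day before
# the weekday of the first timestamp (Monday=0 ... Sunday=6).
days = ["MON", "TUE", "WED", "THU", "FRI", "SAT", "SUN"]
first = data.index.min()
end_day = days[(first.dayofweek - 1) % 7]  # 2016-01-01 is a Friday, so "THU"

# Default weekly bins are labelled with their last day.
weeks = data.resample(f"W-{end_day}").max()

# Relabel each row with the first day of its week instead.
weeks.index = weeks.index - pd.Timedelta(days=6)

print(weeks.head())
print(weeks.tail())
```

There are still 53 rows, because 366 days do not divide evenly into weeks; the last row simply covers the two remaining days of the year.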
Upvotes: 3
Reputation: 150735
One quick remedy, given that your data covers a single year: aggregate it by day first, then group the days in blocks of 7:
new_df = (df.resample("D", on='Date').dots
.max().reset_index()
)
# Keep the grouped result; each block of 7 consecutive days becomes one row
weekly = new_df.groupby(new_df.index//7).agg({'Date': 'min', 'dots': 'max'})
weekly.head(7)
Output:
Date dots
0 2016-01-01 0.996387
1 2016-01-08 0.999775
2 2016-01-15 0.997612
3 2016-01-22 0.979376
4 2016-01-29 0.998240
5 2016-02-05 0.995030
6 2016-02-12 0.987500
and tail:
Date dots
48 2016-12-02 0.999910
49 2016-12-09 0.992910
50 2016-12-16 0.996877
51 2016-12-23 0.992986
52 2016-12-30 0.960348
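The same idea can be sketched without the intermediate daily frame, by numbering the weeks from the day of the year. This is an illustration under a few assumptions: the hourly `data` frame is made up, it has a DatetimeIndex (rather than a `Date` column), and it covers a single year. For several years you would presumably need to group by the year as well.

```python
import numpy as np
import pandas as pd

# Made-up hourly data for 2016, indexed by timestamp.
idx = pd.date_range("2016-01-01", periods=366 * 24, freq=pd.Timedelta(hours=1))
data = pd.DataFrame({"dots": np.arange(len(idx), dtype=float)}, index=idx)

# Week number counted from January 1st: days 1-7 -> 0, days 8-14 -> 1, ...
week_no = np.asarray((data.index.dayofyear - 1) // 7)

stamps = data.index.to_series()
weekly = pd.DataFrame({
    # First day that falls in each week
    "Date": stamps.groupby(week_no).min().dt.normalize(),
    # Weekly maximum of the hourly values
    "dots": data["dots"].groupby(week_no).max(),
})

print(weekly.head())
print(weekly.tail())
```

As in the output above, the last group (52) starts on 2016-12-30 and contains only the days left over at the end of the year.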
Upvotes: 1