oleks5412

Reputation: 75

Pandas resample pulls weekends to Friday

I have daily OCHL data, available on business days only, and I am trying to resample it into 36-day periods to align it with my chart.

The data is in the following format:

                  date     open    close     high      low
0  2019-05-01 21:00:00  0.70147  0.70023  0.70292  0.69952
1  2019-04-30 21:00:00  0.70476  0.70140  0.70610  0.70074
2  2019-04-29 21:00:00  0.70554  0.70498  0.70692  0.70308
3  2019-04-28 21:00:00  0.70380  0.70564  0.70609  0.70377
4  2019-04-25 21:00:00  0.70149  0.70434  0.70613  0.70074

I am doing resampling like this:

year_resampled = df_year.resample('36B').agg({'date':'first','open':'first','close':'last','high':'max','low':'min'})

The problem arises when the data spans a weekend. I have an interval that ends on a Thursday. There is no value for Friday, so resample starts the new interval with Sunday's data, and I need that interval's date changed back to Friday. Example:

 [datetime.datetime(2018, 7, 23, 21, 0), 0.7379, 0.74176, 0.7434, 0.73596],
 [datetime.datetime(2018, 7, 22, 21, 0), 0.74183, 0.73812, 0.74376, 0.73718],  # Sunday: new interval starts; here I want the date changed to 2018-07-20
 [datetime.datetime(2018, 7, 19, 21, 0), 0.73572, 0.74167, 0.74309, 0.73182],  # Thursday: interval ends

Upvotes: 1

Views: 2025

Answers (2)

oleks5412

Reputation: 75

I was looking at the date column when I should have been looking at the index. After resampling, the index holds the dates in the order I wanted. I had aggregated the date column because I was converting the DataFrame to a list, but that column should not have been there, so I removed it:

year_resampled = df_year.resample('36B').agg({'open':'first','close':'last','high':'max','low':'min'})

and used the index to pull the dates into a list.
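For anyone who wants to try this end to end, here is a minimal sketch of the idea. The prices are made up, the frame is assumed to be indexed by date (a `DatetimeIndex` built with `pd.bdate_range`), and the names `dates` and `rows` are only for illustration; they are not from my original code.

```python
import pandas as pd

# Synthetic stand-in for df_year: 72 business days of made-up prices,
# indexed by date, with no separate 'date' column.
idx = pd.bdate_range("2019-01-01", periods=72)
base = [0.70 + 0.0001 * i for i in range(len(idx))]
df_year = pd.DataFrame(
    {
        "open": base,
        "close": [b + 0.0002 for b in base],
        "high": [b + 0.0005 for b in base],
        "low": [b - 0.0005 for b in base],
    },
    index=idx,
)

# Aggregate only the price columns; the interval labels end up in the index.
year_resampled = df_year.resample("36B").agg(
    {"open": "first", "close": "last", "high": "max", "low": "min"}
)

# Pull the dates from the index instead of from an aggregated 'date' column.
dates = year_resampled.index.to_pydatetime().tolist()

# Rebuild rows in the [date, open, close, high, low] shape used in the question.
rows = [[d, *vals] for d, vals in zip(dates, year_resampled.values.tolist())]
```

Each entry of `rows` then starts with the interval label taken from the index, followed by the aggregated open, close, high and low.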

Upvotes: 0

piRSquared

Reputation: 294258

Anchored Offsets

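By default 'W' means weeks ending on Sunday (it is an alias for 'W-SUN'), so each bin is labeled with a Sunday. You can change the anchor day by specifying 'W-Fri', which labels each bin with a Friday.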

df.resample('W', on='date').first()

                 date     open    close     high      low
date                                                     
2019-04-28 2019-04-25  0.70149  0.70434  0.70613  0.70074
2019-05-05 2019-04-29  0.70554  0.70498  0.70692  0.70308

Versus

df.resample('W-Fri', on='date').first()

                 date     open    close     high      low
date                                                     
2019-04-26 2019-04-25  0.70149  0.70434  0.70613  0.70074
2019-05-03 2019-04-28  0.70380  0.70564  0.70609  0.70377
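A quick way to check which day an anchor uses is to look at the day names of the resulting labels. This is a minimal sketch, assuming date-only timestamps (the 21:00 times from the question are dropped) and keeping only the open column; `sun_label_days` and `fri_label_days` are illustrative names.

```python
import pandas as pd

# The five sample rows from the question, dates only, in ascending order.
df = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2019-04-25", "2019-04-28", "2019-04-29", "2019-04-30", "2019-05-01"]
        ),
        "open": [0.70149, 0.70380, 0.70554, 0.70476, 0.70147],
    }
)

# Default weekly anchor versus a Friday anchor.
weekly_default = df.resample("W", on="date")["open"].first()
weekly_fri = df.resample("W-FRI", on="date")["open"].first()

# Each label is the last day of its weekly bin.
sun_label_days = set(weekly_default.index.day_name())
fri_label_days = set(weekly_fri.index.day_name())

print(sun_label_days, fri_label_days)
```

With the Friday anchor, the Sunday 2019-04-28 row falls into the week labeled 2019-05-03 rather than sharing a bin with the previous Thursday.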

Upvotes: 1
