usual me

Reputation: 8778

Pandas - Choose the starting day when resampling every 2 weeks

Say I have the following time series, which starts on 2014-06-01, a Sunday.

In [7]:

import numpy.random as nr
import pandas as pd

# 2014-06-01 is a Sunday
df = pd.Series( index=pd.date_range( '2014-06-01', periods=30 ), data=nr.randn( 30 ) )
df

I can resample weekly, starting on Sundays and closing on Saturdays:

In [9]:

df.resample( 'W-SAT' )
Out[9]:
2014-06-07    0.119460
2014-06-14    0.464789
2014-06-21   -1.211579
2014-06-28    0.650210
2014-07-05    0.666044
Freq: W-SAT, dtype: float64

OK, now I want to do the same thing but every 2 weeks, so I try this:

In [11]:

df.resample( '2W-SAT' )
Out[11]:
2014-06-07    0.119460
2014-06-21   -0.373395
2014-07-05    0.653729
Freq: 2W-SAT, dtype: float64

The output covers 1 week, then 2 weeks, then 2 weeks, which is not what I expected: I was expecting the first index entry to be '2014-06-14'. Why is it doing that? How do I get the first 2 weeks to be resampled together?

Upvotes: 7

Views: 7453

Answers (2)

usual me

Reputation: 8778

After trying the various options of resample, I might have an explanation. The way resample chooses the first entry of the new resampled index seems to depend on the closed option:

  • when closed=left, resample looks for the latest possible start
  • when closed=right, resample looks for the earliest possible start

I will illustrate with an example:

# 2014-06-01 is a Sunday
df = pd.Series( index=pd.date_range( '2014-06-01', periods=30 ), data=range( 1, 31 ) )
df

The following example illustrates the behaviour of closed=left. The latest Saturday that can serve as the left edge of a 2-week interval closed on the left and still cover the first data point (2014-06-01) is 2014-05-31, as shown by the following:

df.resample( '2W-SAT',how='sum', closed='left', label='left' )
Out[119]:
2014-05-31     91
2014-06-14    287
2014-06-28     87
Freq: 2W-SAT, dtype: int64

The next example illustrates the behaviour of closed=right, which is the one that I didn't understand in my initial post (closed=right is the default in resample for weekly rules). The earliest Saturday that can serve as the right edge of a 2-week interval closed on the right and still cover the first data point is 2014-06-07, as shown by the following:

df.resample( '2W-SAT',how='sum', closed='right', label='right' )
Out[122]:
2014-06-07     28
2014-06-21    203
2014-07-05    234
Freq: 2W-SAT, dtype: int64
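
Recent pandas versions no longer accept the how= argument, and resample returns an object that needs an explicit aggregation. As a minimal sketch, assuming such a version, the two examples above could be reproduced as follows (the names s, left and right are only illustrative):

```python
import pandas as pd

# Same data as above: 30 daily values starting on Sunday 2014-06-01.
s = pd.Series(range(1, 31), index=pd.date_range("2014-06-01", periods=30))

# closed='left': each bin is [Saturday, Saturday + 14 days), labelled by its left edge.
left = s.resample("2W-SAT", closed="left", label="left").sum()

# closed='right': each bin is (Saturday - 14 days, Saturday], labelled by its right edge.
right = s.resample("2W-SAT", closed="right", label="right").sum()

print(left)
print(right)
```

With closed='right', the first bin is (2014-05-24, 2014-06-07], so only the seven days from 2014-06-01 to 2014-06-07 fall into it. That is why the first entry aggregates a single week.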

Upvotes: 9

Thomas Cokelaer

Reputation: 1059

The first Saturday of June 2014 is the 7th, so the resampled index starts on the 7th. If you try with Sunday, it starts on the first of June, as expected.

df.resample( '2W-SUN' )
Out[11]: 
2014-06-01    0.739895
2014-06-15    0.497950
2014-06-29    0.445480
2014-07-13    0.767430
Freq: 2W-SUN, dtype: float64
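
It may be worth checking how many days actually land in each '2W-SUN' bin, since the label alone does not show it. The sketch below counts rows per bin, and then shows one possible alternative that is not taken from the answers above: fixed 14-day bins counted from the first timestamp. It assumes a pandas version that supports the origin argument of resample; the names s, sun_counts and fortnight are only illustrative.

```python
import pandas as pd

# 30 daily values starting on Sunday 2014-06-01.
s = pd.Series(range(1, 31), index=pd.date_range("2014-06-01", periods=30))

# Rows per bin with the Sunday-anchored rule (right-closed by default):
# the first bin, labelled 2014-06-01, holds only that single day.
sun_counts = s.resample("2W-SUN").count()
print(sun_counts)

# Alternative: fixed 14-day bins starting at the first timestamp,
# left-closed and labelled by their start date.
fortnight = s.resample("14D", origin="start").sum()
print(fortnight)
```

Under these assumptions the 14-day bins start on 2014-06-01, 2014-06-15 and 2014-06-29, so the first two weeks (Sunday to Saturday, twice) are aggregated together. The labels are the bin starts rather than the closing Saturdays.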

Upvotes: 0
