FooBar
FooBar

Reputation: 16488

Pandas: weekly to daily, but not closed

I have a pandas dataframe:

z = pd.DataFrame(np.arange(6).reshape(2, 3).T , columns=['key', 'val'])
z['year'] = [2012, 2012, 2012]
z['week'] = ['01', '02', '03']
z['date'] = pd.to_datetime(z['year'].astype(str) + '-' + z['week'] + '-1', format='%Y-%W-%w') 

What I want to do is to resample every observation into weekly, i.e. replacing a one-week observation with 7 one-day observations, and to do so by key.

First, without the key:

In [24]: z.set_index('date').resample('d').ffill()                                                                                 
Out[24]: 
            key  val  year week
date                           
2012-01-02    0    3  2012   01
2012-01-03    0    3  2012   01
2012-01-04    0    3  2012   01
2012-01-05    0    3  2012   01
2012-01-06    0    3  2012   01
2012-01-07    0    3  2012   01
2012-01-08    0    3  2012   01
2012-01-09    1    4  2012   02
2012-01-10    1    4  2012   02
2012-01-11    1    4  2012   02
2012-01-12    1    4  2012   02
2012-01-13    1    4  2012   02
2012-01-14    1    4  2012   02
2012-01-15    1    4  2012   02
2012-01-16    2    5  2012   03

Clearly, it's treating the last observation as a closing boundary, instead of repeating for that week as well.

This becomes even more clear when I do the groupby:

In [25]: z.set_index('date').groupby('key').resample('d').ffill()                                                                  
Out[25]: 
                key  val  year week
key date                           
0   2012-01-02    0    3  2012   01
1   2012-01-09    1    4  2012   02
2   2012-01-16    2    5  2012   03

Since there is only one observation per group, resampling doesn't extend the observations at all.

How do I ensure it duplicates for each weak and doesn't use the last week as boundary? I found the argument closed={'left', 'right'}, but I don't want either to be treated as a closed boundary, rather as an open one -- if that makes sense.

Upvotes: 1

Views: 59

Answers (1)

Andrej Kesely
Andrej Kesely

Reputation: 195438

I hope I've understood you well: You can create a date_range with Week offset and then explode on it:

from pandas.tseries.offsets import Week

z["date"] = z["date"].apply(lambda x: pd.date_range(x, x + Week(weekday=6)))
z = z.explode("date")
print(z)

Prints:

   key  val  year week       date
0    0    3  2012   01 2012-01-02
0    0    3  2012   01 2012-01-03
0    0    3  2012   01 2012-01-04
0    0    3  2012   01 2012-01-05
0    0    3  2012   01 2012-01-06
0    0    3  2012   01 2012-01-07
0    0    3  2012   01 2012-01-08
1    1    4  2012   02 2012-01-09
1    1    4  2012   02 2012-01-10
1    1    4  2012   02 2012-01-11
1    1    4  2012   02 2012-01-12
1    1    4  2012   02 2012-01-13
1    1    4  2012   02 2012-01-14
1    1    4  2012   02 2012-01-15
2    2    5  2012   03 2012-01-16
2    2    5  2012   03 2012-01-17
2    2    5  2012   03 2012-01-18
2    2    5  2012   03 2012-01-19
2    2    5  2012   03 2012-01-20
2    2    5  2012   03 2012-01-21
2    2    5  2012   03 2012-01-22

Upvotes: 1

Related Questions