Reputation: 7577
I have a Pandas dataframe with a datetime column. My problem is the following:
I have a starting date of 04/08/2014. Since then, I count weeks in chunks of 16 weeks. So, from 04/08/2014 until 11/08/2014, it will be week 1. After 16 weeks, it will start again from week 1. I want to create a new column where it will find the week of the current chunk based on the datetime column.
This is what I have done, but it seems that it doesn't work as it should.
startingweek = datetime.date(2014, 8, 4)
df['WeekChunk'] = int(((df['DateTimeColumn'] - startingweek) / pd.offsets.Day(1))/7/16)
I calculated the number of days between the two days, then divided by 7 days to find the number of weeks and then divided by 16 to find the week of chunk.
If I use a date of 23/12/2015, it should be week 9. But, the above code seems wrong.
Upvotes: 0
Views: 1091
Reputation: 1033
Here is a way to do it using built-in numpy/pandas time series functionality without using modulo operator:
import pandas as pd
import numpy as np
# re-create a dummy df with a Date column for this example
startingweek = datetime.date(2014, 8, 4)
df = pd.DataFrame(pd.date_range(startingweek, periods=1000, freq='D'), columns=['Date'])
# calc the WeekChunks by recasting the differences into timedelta periods
df['WeekChunks'] = 1 + (df.Date-startingweek).astype(np.timedelta64(1,'W'))
- (df.Date-startingweek).astype(np.timedelta64(1,'16W'))*16
# find Dec 23, 2015...should be WeekChunks = 9
df.set_index('Date').ix['2015-12-23']
WeekChunks 9
Name: 2015-12-23 00:00:00, dtype: float64
Upvotes: 0
Reputation: 973
If you need the week in a period of 16, you need the modulo, not devision. So change "/" to "%". And get int() before that.
df['WeekChunk'] = int(((df['DateTimeColumn'] - startingweek) / pd.offsets.Day(1))/7) % 16
P.S. But the first week would be 0, not 1.
Upvotes: 1