Reputation: 772
I've the following data sample:
Hours Per Work Week | Start Date | End Date |
---|---|---|
10 | 2021-08-01 | 2021-08-31 |
20 | 2021-08-15 | 2021-09-15 |
40 | 2021-09-01 | 2021-09-30 |
I need to get a total by week distribution so the expected dataframe would be something:
total_hours | |
---|---|
week | |
2021-08-01 | 30 |
2021-08-08 | 30 |
2021-08-15 | 30 |
2021-08-22 | 30 |
2021-08-29 | 30 |
2021-09-05 | 60 |
2021-09-12 | 60 |
2021-09-19 | 40 |
2021-09-26 | 40 |
I get how to get the index for weeks:
all_dates = pd.date_range('2021-08-01', '2021-09-30', freq='W')
but then how to use the weeks to sum from the dataset that fits in the range I'm not sure.
Thank you!!
Upvotes: 1
Views: 663
Reputation: 120559
IIUC:
weeks = df.apply(lambda x: pd.date_range(x['Start Date'], x['End Date'], freq='W'),
axis='columns').explode().rename('week')
out = df[['Hours Per Work Week']].join(weeks).groupby('week').sum()
>>> weeks
Hours Per Work Week
week
2021-08-01 50
2021-08-08 50
2021-08-15 70
2021-08-22 70
2021-08-29 70
2021-09-05 60
2021-09-12 60
2021-09-19 40
2021-09-26 40
Upvotes: 2