Reputation: 49
So I have a dataframe like this
df1 = {'id':[123,347],
'shift_start':['2022-01-25 09:00:00', '2022-01-28 19:00:00'],
'shift_end':['2022-01-25 12:30:00', '2022-01-29 00:00:00']}
df1 = pd.DataFrame(df1,columns=['id','shift_start', 'shift_end'])
id shift_start shift_end
123 2022-01-25 09:00:00 2022-01-28 12:30:00
347 2022-01-25 23:00:00 2022-01-29 00:00:00
And I would like to get for every id how many 30min slots I have in each shift, so something like this
id shift_start shift_end slot_start slot_end
123 2022-01-25 09:00:00 2022-01-28 12:30:00 2022-01-25 09:00:00 2022-01-25 09:30:00
123 2022-01-25 09:00:00 2022-01-28 12:30:00 2022-01-25 09:30:00 2022-01-25 10:00:00
123 2022-01-25 09:00:00 2022-01-28 12:30:00 2022-01-25 10:00:00 2022-01-25 10:30:00
123 2022-01-25 09:00:00 2022-01-28 12:30:00 2022-01-25 10:30:00 2022-01-25 11:00:00
123 2022-01-25 09:00:00 2022-01-28 12:30:00 2022-01-25 11:00:00 2022-01-25 11:30:00
123 2022-01-25 09:00:00 2022-01-28 12:30:00 2022-01-25 11:30:00 2022-01-25 12:00:00
123 2022-01-25 09:00:00 2022-01-28 12:30:00 2022-01-25 12:00:00 2022-01-25 12:30:00
347 2022-01-25 23:00:00 2022-01-29 00:00:00 2022-01-25 23:00:00 2022-01-29 23:30:00
347 2022-01-25 23:00:00 2022-01-29 00:00:00 2022-01-25 23:30:00 2022-01-29 00:00:00
I've found a way to find the 30min slots but I still cannot find the way to add them into the dataframe
def datetime_range(start, end, delta):
current = start
while current < end:
yield current
current += delta
for index, row in df1.iterrows():
dts = [dt.strftime('%Y-%m-%d %H:%M:%S') for dt in
datetime_range(datetime(row['shift_start'].year, row['shift_start'].month, row['shift_start'].day, row['shift_start'].hour, row['shift_start'].minute),
datetime(row['shift_end'].year, row['shift_end'].month, row['shift_end'].day, row['shift_end'].hour, row['shift_end'].minute),
timedelta(minutes=30))]
print(dts)
Any help will be very welcome. Thanks!
Upvotes: 1
Views: 57
Reputation: 3280
Here is one way:
slots = df1[['shift_start', 'shift_end']]
.apply(lambda x: pd.date_range(*x, freq='30min'), axis=1)
.explode()
.rename('slot_start')
.to_frame()
slots['slot_end'] = slots['slot_start'].shift(-1)
df1 = pd.concat([df1, slots], axis=1).dropna()
df1 = df1[(df1.shift_end >= df1.slot_end)]
Output:
id shift_start shift_end slot_start slot_end
0 123 2022-01-25 09:00:00 2022-01-25 12:30:00 2022-01-25 09:00:00 2022-01-25 09:30:00
0 123 2022-01-25 09:00:00 2022-01-25 12:30:00 2022-01-25 09:30:00 2022-01-25 10:00:00
0 123 2022-01-25 09:00:00 2022-01-25 12:30:00 2022-01-25 10:00:00 2022-01-25 10:30:00
0 123 2022-01-25 09:00:00 2022-01-25 12:30:00 2022-01-25 10:30:00 2022-01-25 11:00:00
0 123 2022-01-25 09:00:00 2022-01-25 12:30:00 2022-01-25 11:00:00 2022-01-25 11:30:00
0 123 2022-01-25 09:00:00 2022-01-25 12:30:00 2022-01-25 11:30:00 2022-01-25 12:00:00
0 123 2022-01-25 09:00:00 2022-01-25 12:30:00 2022-01-25 12:00:00 2022-01-25 12:30:00
1 347 2022-01-28 19:00:00 2022-01-29 00:00:00 2022-01-28 19:00:00 2022-01-28 19:30:00
1 347 2022-01-28 19:00:00 2022-01-29 00:00:00 2022-01-28 19:30:00 2022-01-28 20:00:00
1 347 2022-01-28 19:00:00 2022-01-29 00:00:00 2022-01-28 20:00:00 2022-01-28 20:30:00
1 347 2022-01-28 19:00:00 2022-01-29 00:00:00 2022-01-28 20:30:00 2022-01-28 21:00:00
1 347 2022-01-28 19:00:00 2022-01-29 00:00:00 2022-01-28 21:00:00 2022-01-28 21:30:00
1 347 2022-01-28 19:00:00 2022-01-29 00:00:00 2022-01-28 21:30:00 2022-01-28 22:00:00
1 347 2022-01-28 19:00:00 2022-01-29 00:00:00 2022-01-28 22:00:00 2022-01-28 22:30:00
1 347 2022-01-28 19:00:00 2022-01-29 00:00:00 2022-01-28 22:30:00 2022-01-28 23:00:00
1 347 2022-01-28 19:00:00 2022-01-29 00:00:00 2022-01-28 23:00:00 2022-01-28 23:30:00
1 347 2022-01-28 19:00:00 2022-01-29 00:00:00 2022-01-28 23:30:00 2022-01-29 00:00:00
Upvotes: 2