Reputation: 2871
I have a data frame as shown below
session appt_time
s1 2020-01-06 09:00:00
s1 2020-01-06 09:20:00
s1 2020-01-06 09:40:00
s1 2020-01-06 09:40:00
s1 2020-01-06 10:00:00
s1 2020-01-06 10:00:00
s2 2020-01-06 08:20:00
s2 2020-01-06 08:40:00
s2 2020-01-06 08:40:00
s2 2020-01-06 09:00:00
s2 2020-01-06 09:20:00
s2 2020-01-06 09:40:00
s2 2020-01-06 09:40:00
s2 2020-01-06 10:00:00
s3 2020-01-09 13:00:00
s3 2020-01-09 13:00:00
s3 2020-01-09 13:20:00
s3 2020-01-09 13:40:00
From the above I would like to create a new columns called ideal_appt_time and slot_num as shown below.
session appt_time ideal_appt_time slot_num
s1 2020-01-06 09:00:00 2020-01-06 09:00:00 1
s1 2020-01-06 09:20:00 2020-01-06 09:20:00 2
s1 2020-01-06 09:40:00 2020-01-06 09:40:00 3
s1 2020-01-06 09:40:00 2020-01-06 10:00:00 4
s1 2020-01-06 10:00:00 2020-01-06 10:20:00 5
s1 2020-01-06 10:00:00 2020-01-06 10:40:00 6
s2 2020-01-06 08:20:00 2020-01-06 08:20:00 1
s2 2020-01-06 08:40:00 2020-01-06 08:40:00 2
s2 2020-01-06 08:40:00 2020-01-06 09:00:00 3
s2 2020-01-06 09:00:00 2020-01-06 09:20:00 4
s2 2020-01-06 09:20:00 2020-01-06 09:40:00 5
s2 2020-01-06 09:40:00 2020-01-06 10:00:00 6
s2 2020-01-06 09:40:00 2020-01-06 10:20:00 7
s2 2020-01-06 10:00:00 2020-01-06 10:40:00 8
s3 2020-01-09 13:00:00 2020-01-09 13:00:00 1
s3 2020-01-09 13:00:00 2020-01-09 13:20:00 2
s3 2020-01-09 13:20:00 2020-01-09 13:40:00 3
s3 2020-01-09 13:40:00 2020-01-09 14:00:00 4
Explanation:
where ideal_appt_time is calculated based on appt_time, start of ideal_appt_time is same as appt_time. then adding 20 minutes to that, where as in appt_time some appt_time are repeating.
slot_num just counted the slot of that session based on appoitment time.
Upvotes: 1
Views: 31
Reputation: 862511
Use GroupBy.cumcount
for counter Series
, converted to timedeltas by to_timedelta
and multiple 20
for 20 Minutes
.
Then get first timestamp per group by GroupBy.transform
and GroupBy.first
, add timedeltas and last for counter column add 1
:
df['appt_time'] = pd.to_datetime(df['appt_time'])
counts = df.groupby('session').cumcount()
td = pd.to_timedelta(counts, unit='Min') * 20
df['ideal_appt_time'] = df.groupby('session')['appt_time'].transform('first') + td
df['slot_num'] = counts + 1
print (df)
session appt_time ideal_appt_time slot_num
0 s1 2020-01-06 09:00:00 2020-01-06 09:00:00 1
1 s1 2020-01-06 09:20:00 2020-01-06 09:20:00 2
2 s1 2020-01-06 09:40:00 2020-01-06 09:40:00 3
3 s1 2020-01-06 09:40:00 2020-01-06 10:00:00 4
4 s1 2020-01-06 10:00:00 2020-01-06 10:20:00 5
5 s1 2020-01-06 10:00:00 2020-01-06 10:40:00 6
6 s2 2020-01-06 08:20:00 2020-01-06 08:20:00 1
7 s2 2020-01-06 08:40:00 2020-01-06 08:40:00 2
8 s2 2020-01-06 08:40:00 2020-01-06 09:00:00 3
9 s2 2020-01-06 09:00:00 2020-01-06 09:20:00 4
10 s2 2020-01-06 09:20:00 2020-01-06 09:40:00 5
11 s2 2020-01-06 09:40:00 2020-01-06 10:00:00 6
12 s2 2020-01-06 09:40:00 2020-01-06 10:20:00 7
13 s2 2020-01-06 10:00:00 2020-01-06 10:40:00 8
14 s3 2020-01-09 13:00:00 2020-01-09 13:00:00 1
15 s3 2020-01-09 13:00:00 2020-01-09 13:20:00 2
16 s3 2020-01-09 13:20:00 2020-01-09 13:40:00 3
17 s3 2020-01-09 13:40:00 2020-01-09 14:00:00 4
Upvotes: 1