Reputation: 43
I have to create a shift column based on existing time column.
For example, I have a dataframe df1 with details:
time
0 10:30
1 13:50
2 19:20
3 14:10
I need a dataframe which looks like below with shift condition:
time shift 0 10:30 1 1 13:50 2 2 19:20 2 3 23:10 3
Upvotes: 3
Views: 163
Reputation: 13274
The following uses a dictionary of shifts that helps determine the shift associated with a given time:
import pandas as pd
df = pd.DataFrame({'time': ['00:00','08:29', '08:30', '08:31', '12:29', '12:30', '12:31', '20:29', '20:30', '20:31', '23:59', '10:30', '13:50', '19:20', '14:10', '23:10']})
# Convert the time column into datetime objects
df.time = pd.to_datetime(df.time).dt.time
# Set up a shifts dictionary
shifts = {('8:30', '12:30'): 1 , ('12:30', '20:30'): 2, ('20:30', '8:30'): 3}
# Convert the keys to datetime objects
shifts = {tuple(map(pd.to_datetime, k)):v for k,v in shifts.items()}
# Expand the datetime objects beyond one day if the second element occurred after the first element
shifts = {(k if k[0].time() < k[1].time() else (k[0],k[1]+pd.to_timedelta('1day'))):v for k,v in shifts.items()}
# Determine shift
def get_shift(time):
try:
return shifts.get([k for k in shifts if time in pd.date_range(*k, freq='60S', closed='left').time][0])
except:
return 'No Shift'
# Use .apply on the time column to get the shift column
df['shift'] = df.time.apply(get_shift)
print(df)
Outputs:
# time shift
# 0 00:00:00 3
# 1 08:29:00 3
# 2 08:30:00 1
# 3 08:31:00 1
# 4 12:29:00 1
# 5 12:30:00 2
# 6 12:31:00 2
# 7 20:29:00 2
# 8 20:30:00 3
# 9 20:31:00 3
# 10 23:59:00 3
# 11 10:30:00 1
# 12 13:50:00 2
# 13 19:20:00 2
# 14 14:10:00 2
# 15 23:10:00 3
Upvotes: 1
Reputation: 210842
Assuming we have the following DF:
In [380]: df
Out[380]:
time
0 00:00
1 08:29
2 08:30
3 08:31
4 12:29
5 12:30
6 12:31
7 20:29
8 20:30
9 20:31
10 23:59
In [381]: df.dtypes
Out[381]:
time object
dtype: object
Consider this solution:
In [382]: bins = [-1, 830, 1230, 2030, 2400]
...: labels = [0,1,2,3]
...: df['shift'] = pd.cut(df.time.str.replace(':','').astype(int),
...: bins=bins, labels=labels, right=False)
...: df.loc[df['shift']==0, 'shift'] = 3
...:
In [383]: df
Out[383]:
time shift
0 00:00 3
1 08:29 3
2 08:30 1
3 08:31 1
4 12:29 1
5 12:30 2
6 12:31 2
7 20:29 2
8 20:30 3
9 20:31 3
10 23:59 3
Explanation:
time
to numeric value like 08:29
--> 829
, 12:31
--> 1231
, etc.[0,1,2,3]
NOTE: the labels must be unique, that's why we couldnt specify [3,1,2,3]
0
--> 3
as we must split the interval between 20:30 - 08:30
into two: 00:00 - 08:30
and 20:30 - 23:59:59
Upvotes: 0
Reputation: 50560
You can accomplish this by apply
ing a function to the creation of your shift
column.
import datetime
def check_shift(row):
shift_time = row[0]
if datetime.time(8, 30) <= shift_time <= datetime.time(12, 30):
return 1
elif datetime.time(12, 30) < shift_time <= datetime.time(20, 30):
return 2
else:
return 3
df['shift'] = df.apply(check_shift, axis='columns')
This will result in the following dataframe
time shift
0 10:30:00 1
1 13:50:00 2
2 19:20:00 2
3 14:10:00 2
If we adjust that last shift to be 23:10
(like your sample output), we get the following:
time shift
0 10:30:00 1
1 13:50:00 2
2 19:20:00 2
3 23:10:00 3
One important note here, I converted the time
column from a string to an actual time
type:
df['time'] = pd.to_datetime(df['time'], format="%H:%M").dt.time
Upvotes: 0