Reputation: 43

Create a new column base on existing time column in a dataframe

I have to create a shift column based on existing time column.

For example, I have a dataframe df1 with details:

I need a dataframe which looks like below with shift condition:

8:30 to 12:30 = shift 1,
12:30 to 20:20 = shift 2
20:30 to 8:30 = shift 3

   time shift
0  10:30    1
1  13:50    2
2  19:20    2
3  23:10    3

Upvotes: 3

Answers (3)

Abdou

Reputation: 13274

The following uses a dictionary of shifts that helps determine the shift associated with a given time:

import pandas as pd


df = pd.DataFrame({'time': ['00:00','08:29', '08:30', '08:31', '12:29', '12:30', '12:31', '20:29', '20:30', '20:31', '23:59', '10:30', '13:50', '19:20', '14:10', '23:10']})

# Convert the time column into datetime objects
df.time = pd.to_datetime(df.time).dt.time

# Set up a shifts dictionary
shifts = {('8:30', '12:30'): 1 , ('12:30', '20:30'): 2, ('20:30', '8:30'): 3}

# Convert the keys to datetime objects
shifts = {tuple(map(pd.to_datetime, k)):v for k,v in shifts.items()}

# Expand the datetime objects beyond one day if the second element occurred after the first element
shifts = {(k if k[0].time() < k[1].time() else (k[0],k[1]+pd.to_timedelta('1day'))):v for k,v in shifts.items()}

# Determine shift
def get_shift(time):
    try:
        return shifts.get([k for k in shifts if time in pd.date_range(*k, freq='60S', closed='left').time][0])
    except:
        return 'No Shift'

# Use .apply on the time column to get the shift column
df['shift'] = df.time.apply(get_shift)

print(df)

Outputs:

#         time  shift
# 0   00:00:00      3
# 1   08:29:00      3
# 2   08:30:00      1
# 3   08:31:00      1
# 4   12:29:00      1
# 5   12:30:00      2
# 6   12:31:00      2
# 7   20:29:00      2
# 8   20:30:00      3
# 9   20:31:00      3
# 10  23:59:00      3
# 11  10:30:00      1
# 12  13:50:00      2
# 13  19:20:00      2
# 14  14:10:00      2
# 15  23:10:00      3

Upvotes: 1

MaxU - stand with Ukraine

Reputation: 210842

Assuming we have the following DF:

In [380]: df
Out[380]:
     time
0   00:00
1   08:29
2   08:30
3   08:31
4   12:29
5   12:30
6   12:31
7   20:29
8   20:30
9   20:31
10  23:59

In [381]: df.dtypes
Out[381]:
time    object
dtype: object

Consider this solution:

In [382]: bins = [-1, 830, 1230, 2030, 2400]
     ...: labels = [0,1,2,3]
     ...: df['shift'] = pd.cut(df.time.str.replace(':','').astype(int),
     ...:                      bins=bins, labels=labels, right=False)
     ...: df.loc[df['shift']==0, 'shift'] = 3
     ...:

In [383]: df
Out[383]:
     time shift
0   00:00     3
1   08:29     3
2   08:30     1
3   08:31     1
4   12:29     1
5   12:30     2
6   12:31     2
7   20:29     2
8   20:30     3
9   20:31     3
10  23:59     3

Explanation:

first we convert time to numeric value like 08:29 --> 829, 12:31 --> 1231, etc.
now we can cut them into 4 bins (shifts): [0,1,2,3] NOTE: the labels must be unique, that's why we couldnt specify [3,1,2,3]
finally we should change 0 --> 3 as we must split the interval between 20:30 - 08:30 into two: 00:00 - 08:30 and 20:30 - 23:59:59

Upvotes: 0

Andy

Reputation: 50560

You can accomplish this by applying a function to the creation of your shift column.

import datetime

def check_shift(row):
    shift_time = row[0]
    if datetime.time(8, 30) <= shift_time <= datetime.time(12, 30):
        return 1
    elif datetime.time(12, 30) < shift_time <= datetime.time(20, 30):
        return 2
    else:
        return 3

df['shift'] = df.apply(check_shift, axis='columns')

This will result in the following dataframe

       time  shift
0  10:30:00      1
1  13:50:00      2
2  19:20:00      2
3  14:10:00      2

If we adjust that last shift to be 23:10 (like your sample output), we get the following:

       time  shift
0  10:30:00      1
1  13:50:00      2
2  19:20:00      2
3  23:10:00      3

One important note here, I converted the time column from a string to an actual time type:

df['time'] = pd.to_datetime(df['time'], format="%H:%M").dt.time

Upvotes: 0

Create a new column base on existing time column in a dataframe

Answers (3)

Related Questions