Reputation: 768
I have two columns in pandas that are Timestamps as UTC: StartDate
and EndDate
. The dates are for the task that was submitted for help. I am trying to calculate how many hours did these task exist until finished during work hours. My company is open from 8 am to 6 pm PST Monday-Friday.
I have attempted to make these calculations but I have run into a problem. I don't know how to only get the hours during work time. I have figured out how to the the hours overthe entire time period, but I don't know how to move forward with this. My current thought is to make an if statment in a for loop and check each value of the SLATable['Date Responded (Hours)']
column that is generated from my calculations, but last time I tried to edit value is in column indiviually python throwed me an error.
#8-6 PST to UTC time
officeOpen = pd.Timestamp("8:00:00.000").tz_localize('US/Pacific').tz_convert('utc')
officeClose = pd.Timestamp("18:00:00.000").tz_localize('US/Pacific').tz_convert('utc')
#get data from sql server
SLATable = pd.read_sql_query(SqlQuery,conn)
#calculate Date Responded
SLATable['Date Responded (Hours)'] = SLATable['EndDate'] - SLATable['StartDate']
SLATable['Date Responded (Hours)'] = round(SLATable['Date Responded (Hours)']/np.timedelta64(1,'h'), 2)
Currently, if I use the code above it would work for any task that was finished on the same day as it was created if it was created during working hours, but task created Monday and was finished Tuesday would have hours outside of work hours. Also if a task is created outside of office hours it would collect time untiled we addressed it during working hours.
These calculations are not designed to take in holidays of any country just Monday to Friday from 8 to 6.
example data if run through my calculation:
StartDate EndDate Date Responded (Hours)
2016-05-03 2016-05-03 0.13
15:51:11.850 15:59:13.017
2016-05-05 2016-05-06 17.64
23:01:51.023 16:40:21.350
what the output should be if calculated correctly:
StartDate EndDate Date Responded (Hours)
2016-05-03 2016-05-03 0.13
15:51:11.850 15:59:13.017
2016-05-05 2016-05-06 0.32
23:01:51.023 16:40:21.350
Upvotes: 1
Views: 1844
Reputation: 31011
The first step is to define a BusinessHour offset, with proper start / end hours (it will be used later):
bhOffs = pd.offsets.BusinessHour(start='08:00', end='18:00')
Then define a function computing business time from UTC time, with proper TimeZone offset:
def BusTime(ts, hOffs, fwd):
'''Compute business time. Params:
ts - UTC time (string or Timestamp)
hOffs - Hour offset (int)
fwd - Roll variant (forward / backward, bool)
'''
tsWrk = ts if type(ts) == 'str' else pd.Timestamp(ts)
tsOffs = tsWrk + np.timedelta64(hOffs, 'h')
if fwd: # Roll if on End of Day
tsRoll = bhOffs.rollforward(tsOffs + np.timedelta64(1, 'ms'))
else: # Don't roll if on End of Day
tsRoll = bhOffs.rollforward(tsOffs - np.timedelta64(1, 'ms'))
return tsRoll if tsRoll.day != tsOffs.day else tsOffs
And as the last step, define a function computing business hours:
def BusHrs(ts1, ts2, hOffs=0):
'''Compute business hours between 2 DateTimes. Params:
ts1, ts2 - From / To (UTC, Timestamp or string)
hOffs - Hour offset (int)
'''
t1 = BusTime(ts1, hOffs, True)
t2 = BusTime(ts2, hOffs, False)
bHrs = pd.date_range(start=t1.floor('h'), end=t2.floor('h'),
freq=bhOffs, closed='left').size
frac1 = t1 - t1.normalize() - np.timedelta64(t1.hour, 'h')
frac2 = t2 - t2.normalize() - np.timedelta64(t2.hour, 'h')
return bHrs + (frac2 - frac1) / np.timedelta64(1, 'h')
The idea is to:
I performed the test on your data:
BusHrs('2016-05-03 15:51:11.850', '2016-05-03 15:59:13.017', -7)
- result
0.1336575.BusHrs('2016-05-05 23:01:51.023', '2016-05-06 16:40:21.350', -7)
- result
3.6417574999999998.The second result is different from your expected result, but the rationale is as follows:
2016-05-05 23:01
UTC is 2016-05-05 16:01
(Pacific).2016-05-06 16:40
UTC is 2016-05-06 09:40
(Pacific).I didn't test this function on your third set of start / end times, as there is probably something wrong with it (the actual work time is far above your expected result).
Upvotes: 2