Reputation: 97
This is a tough question. I have a data-frame that looks something like this:
Employee Date Hours1 Reg OT
0 emp1 08/12/2018 10.00 8.00 2.00
1 emp1 08/13/2018 10.00 8.00 2.00
2 emp1 08/14/2018 10.00 8.00 2.00
3 emp1 08/15/2018 10.00 8.00 2.00
4 emp1 08/17/2018 10.00 8.00 2.00
5 emp2 08/12/2018 7.00 7.00 0.00
6 emp2 08/13/2018 9.00 8.00 1.00
7 emp2 08/15/2018 10.00 8.00 2.00
8 emp2 08/17/2018 10.00 8.00 2.00
9 emp2 08/18/2018 8.00 8.00 0.00
What I am trying to do is calculate over-time for employees in California. California law says anything over 8 hours in a single day and/or anything over 40 hours in a single week is considered over-time.
Hours1
= total hours worked that day.
Reg = np.where((df['Hours1']) < 8, df['Hours1'], (df['OT']))
(Regular time hours)
OT = np.where((df['Hours1']) > 8, (df['Hours1']) - 8, (df['OT']))
(OT hours)
The issue occurs when an employee works over 40 hours in a week and has worked over 8 hours in a single day. I am having a hard time combining the two formulas. As you can see, emp1
worked 40 hours in the first 4 days, so all hours in row=4
should be considered overtime.
For each date, I would like to add up all the hours under Hours1
, for each employee, and when the amount >=40 then all subsequent hours are considered OT and, for that employee, no hours should be considered regular any longer. It feels like there should be a loop for each row, for each employee, that contains this expression but everything I have tried has not worked.
Ouput should like like this:
Employee Date Hours1 Reg OT
0 emp1 08/12/2018 10.00 8.00 2.00
1 emp1 08/13/2018 10.00 8.00 2.00
2 emp1 08/14/2018 10.00 8.00 2.00
3 emp1 08/15/2018 10.00 8.00 2.00
4 emp1 08/17/2018 10.00 0.00 10.00
5 emp2 08/12/2018 7.00 7.00 0.00
6 emp2 08/13/2018 9.00 8.00 1.00
7 emp2 08/15/2018 10.00 8.00 2.00
8 emp2 08/17/2018 10.00 8.00 2.00
9 emp2 08/18/2018 8.00 4.00 4.00
Any and all help is appreciated.
For future readers the key to the equation was cumsum
. The final code looked like this (Note: I dropped reg hours):
df['Cum hours'] = df.groupby('Employee')['Hours1'].transform('cumsum')
df['Week OT'] = np.where(df['Cum hours'] - df['Hours1'] > 40,
df['Hours1'],
df['Cum hours'] - 40)
df['OT'] = np.where(df['Cum hours'] > 40,
df['Week OT'],
df['OT'])
df = df.loc[:, ['Employee', 'Date', 'Hours1', 'OT']]
Upvotes: 1
Views: 566
Reputation: 1509
Assuming the dataframe only contains data from a single week, do:
cumulative_hours = df.groupby('Employee')['Hours1'].cumsum()
ot_within_day = df.OT
ot_within_week = np.where(cumulative_hours - df.Hours1 > 40,
df.Hours1,
cumulative_hours - 40)
ot_CA = np.where(cumulative_hours > 40,
ot_within_week,
ot_within_day)
Upvotes: 3