jonboy

Reputation: 372

Automate shift times whilst accounting for constraints

I have a script that generates shift times automatically from staff availability, subject to the following constraints:

  1. At any given time period, you must meet the minimum staffing requirements
  2. A person has a minimum and maximum number of hours they can work
  3. An employee can only be scheduled to work within their available hours
  4. A person can only work one shift per day
  5. A person can start no later than 8PM

To give an overview: the staff_availability df lists the employees to choose from ['Person'], the minimum and maximum hours each can work ['MinHours'] and ['MaxHours'], their hourly pay ['HourlyWage'], and their availability, expressed both in hours ['Availability_Hr'] and in 15-minute segments ['Availability_15min_Seg'].
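As a rough illustration of how the two availability columns relate, the sketch below converts an hour of the day into a 15-minute segment number. It assumes segment 1 starts at 08:00 (which matches '8-18' being paired with '1-41') and that hours past midnight belong to the following day. The helper name `hour_to_segment` is hypothetical and is not part of the script.

```python
def hour_to_segment(hour, day_start_hour=8, next_day=False):
    """Return the 1-based 15-minute segment that begins at `hour`.

    Assumption: segment 1 begins at `day_start_hour` (08:00 here).
    """
    if next_day:
        hour += 24  # e.g. 1 AM on the following day counts as hour 25
    return (hour - day_start_hour) * 4 + 1


print(hour_to_segment(8))                 # start of the day
print(hour_to_segment(18))                # 6 PM
print(hour_to_segment(1, next_day=True))  # 1 AM the next day
print(hour_to_segment(20))                # 8 PM
```

Under these assumptions 8 PM falls on segment 49, so the cutoff value used for the 5th constraint may be worth double-checking.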

The staffing_requirements df contains the time of day ['Time'] and the staff required ['People'] during those periods.

The script builds a df 'availability_per_member' that shows whether each employee is available at each timeslot: 1 indicates available to be scheduled and 0 indicates not available. It then allocates shift times with pulp, subject to the constraints above.

My question concerns the 5th constraint, and it is a coding problem. The script runs when this constraint is commented out. The constraint and the error it raises are posted below:

# Do not start people later than 8PM
for timeslot in timeslots:
    prob += (sum([staffed[(timeslot, person)] for person in persons])
    <= staffing_requirements.loc[person, 'Availability_Hr'] <= 52)

Error:

KeyError: 'the label [C11] is not in the [index]'

Script:

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as dates


staffing_requirements = pd.DataFrame({
    'Time' : ['0/1/1900 8:00:00','0/1/1900 9:59:00','0/1/1900 10:00:00','0/1/1900 12:29:00','0/1/1900 12:30:00','0/1/1900 13:00:00','0/1/1900 13:02:00','0/1/1900 13:15:00','0/1/1900 13:20:00','0/1/1900 18:10:00','0/1/1900 18:15:00','0/1/1900 18:20:00','0/1/1900 18:25:00','0/1/1900 18:45:00','0/1/1900 18:50:00','0/1/1900 19:05:00','0/1/1900 19:07:00','0/1/1900 21:57:00','0/1/1900 22:00:00','0/1/1900 22:30:00','0/1/1900 22:35:00','1/1/1900 3:00:00','1/1/1900 3:05:00','1/1/1900 3:20:00','1/1/1900 3:25:00'],                 
    'People' : [1,1,2,2,3,3,2,2,3,3,4,4,3,3,2,2,3,3,4,4,3,3,2,2,1],                      
    })

staff_availability = pd.DataFrame({
    'Person' : ['C1','C2','C3','C4','C5','C6','C7','C8','C9','C10','C11'],                 
    'MinHours' : [5,5,5,5,5,5,5,5,5,5,5],    
    'MaxHours' : [10,10,10,10,10,10,10,10,10,10,10],                 
    'HourlyWage' : [26,26,26,26,26,26,26,26,26,26,26],  
    'Availability_Hr' : ['8-18','8-18','8-18','9-18','9-18','9-18','12-1','12-1','17-3','17-3','17-3'],                              
    'Availability_15min_Seg' : ['1-41','1-41','1-41','5-41','5-41','5-41','17-69','17-79','37-79','37-79','37-79'],                              
    })

''' Generate availability at each point in time '''

staffing_requirements['Time'] = ['/'.join([str(int(x.split('/')[0])+1)] + x.split('/')[1:]) for x in staffing_requirements['Time']]
staffing_requirements['Time'] = pd.to_datetime(staffing_requirements['Time'], format='%d/%m/%Y %H:%M:%S')
formatter = dates.DateFormatter('%Y-%m-%d %H:%M:%S') 

# 15 Min
staffing_requirements = staffing_requirements.groupby(pd.Grouper(freq='15T',key='Time'))['People'].max().ffill()
staffing_requirements = staffing_requirements.reset_index(level=['Time'])

staffing_requirements.index = range(1, len(staffing_requirements) + 1) 

staff_availability.set_index('Person')

staff_costs = staff_availability.set_index('Person')[['MinHours', 'MaxHours', 'HourlyWage']]
availability = staff_availability.set_index('Person')[['Availability_15min_Seg']]
availability[['first_15min', 'last_15min']] =  availability['Availability_15min_Seg'].str.split('-', expand=True).astype(int)

availability_per_member =  [pd.DataFrame(1, columns=[idx], index=range(row['first_15min'], row['last_15min']+1))
 for idx, row in availability.iterrows()]

availability_per_member = pd.concat(availability_per_member, axis='columns').fillna(0).astype(int).stack()
availability_per_member.index.names = ['Timeslot', 'Person']
availability_per_member = (availability_per_member.to_frame()
                            .join(staff_costs[['HourlyWage']])
                            .rename(columns={0: 'Available'}))


''' Generate shift times based off availability  '''

import pulp
prob = pulp.LpProblem('CreateStaffing', pulp.LpMinimize) # Minimize costs

timeslots = staffing_requirements.index
persons = availability_per_member.index.levels[1]

# A member is either staffed or is not at a certain timeslot
staffed = pulp.LpVariable.dicts("staffed",
                                   ((timeslot, staffmember) for timeslot, staffmember 
                                in availability_per_member.index),
                                 lowBound=0,
                                 cat='Binary')

# Objective = cost (= sum of hourly wages)                              
prob += pulp.lpSum(
    [staffed[timeslot, staffmember] * availability_per_member.loc[(timeslot, staffmember), 'HourlyWage'] 
    for timeslot, staffmember in availability_per_member.index]
)

# Staff the right number of people
for timeslot in timeslots:
    prob += (sum([staffed[(timeslot, person)] for person in persons]) 
    >= staffing_requirements.loc[timeslot, 'People'])


# Do not staff unavailable persons
for timeslot in timeslots:
    for person in persons:
        if availability_per_member.loc[(timeslot, person), 'Available'] == 0:
            prob += staffed[timeslot, person] == 0

# Do not underemploy people
for person in persons:
    prob += (sum([staffed[(timeslot, person)] for timeslot in timeslots])
    >= staff_costs.loc[person, 'MinHours']*4) # timeslot is 15 minutes => 4 timeslots = hour

# Do not overemploy people
for person in persons:
    prob += (sum([staffed[(timeslot, person)] for timeslot in timeslots])
    <= staff_costs.loc[person, 'MaxHours']*4) # timeslot is 15 minutes => 4 timeslots = hour

# Do not start people later than 8PM
for timeslot in timeslots:
    prob += (sum([staffed[(timeslot, person)] for person in persons])
    <= staffing_requirements.loc[person, 'Availability_Hr'] <= 52)    

# If an employee works and then stops, they can't start again
num_slots = max(timeslots)
for timeslot in timeslots:
    if timeslot < num_slots:
        for person in persons:
            prob += staffed[timeslot+1, person] <= staffed[timeslot, person] + \
                (1 - (1./num_slots) *
                 sum([staffed[(s, person)] for s in timeslots if s < timeslot]))    


prob.solve()
print(pulp.LpStatus[prob.status])

output = []
for timeslot, staffmember in staffed:
    var_output = {
        'Timeslot': timeslot,
        'Staffmember': staffmember,
        'Staffed': staffed[(timeslot, staffmember)].varValue,
    }
    output.append(var_output)
output_df = pd.DataFrame.from_records(output)#.sort_values(['timeslot', 'staffmember'])
output_df.set_index(['Timeslot', 'Staffmember'], inplace=True)
if pulp.LpStatus[prob.status] == 'Optimal':
    print(output_df)

Upvotes: 1

Views: 283

Answers (1)

kabdulla

Reputation: 5429

There's discussion in the comments about whether this is an OR problem or a python/pulp coding problem. I think it's a bit of both.

I don't follow how your version of the no shifts starting after 8PM code is supposed to work. This may be because I haven't seen your mathematical formulation (as suggested in the comments).
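On the KeyError itself, one likely reading (an assumption, since only the last line of the traceback is shown) is that `person` is not a loop variable inside the `for timeslot in timeslots` loop, so it still holds the last value from the previous loop, 'C11'. staffing_requirements is indexed by integer timeslots and has no 'Availability_Hr' column, so the lookup fails. A minimal sketch with made-up toy data:

```python
import pandas as pd

# Toy stand-in for staffing_requirements: integer timeslot index, one column.
staffing_requirements = pd.DataFrame({'People': [1, 2, 3]}, index=range(1, 4))
persons = ['C1', 'C2', 'C11']

# After an earlier loop finishes, `person` keeps its final value.
for person in persons:
    pass

lookup_error = None
try:
    # Same lookup pattern as the failing constraint: a person label used
    # as a row key on a frame that is indexed by timeslot numbers.
    staffing_requirements.loc[person, 'Availability_Hr']
except KeyError as err:
    lookup_error = err

print(person)                       # the stale loop variable
print(type(lookup_error).__name__)  # the exception class that was raised
```

The exact wording of the error message depends on the pandas version, so the sketch only checks the exception type.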

The way I would do it is as follows. You don't want shifts to start after 8PM, which I'm assuming is timeslot 52 from your attempt. Given that you are not allowing split shifts, the easiest way I can think of to apply this constraint is to say (in pseudocode):

for each person: if they have no slots at or before the 8PM slot, then they are not allowed any slots after 8PM.

And in code:

for person in persons:
    prob += pulp.lpSum([staffed[timeslot, person] for timeslot in timeslots if timeslot > 52]) <= \
                pulp.lpSum([staffed[timeslot, person] for timeslot in timeslots if timeslot <= 52])*30

To see how this works, consider the two cases.

First, take a person who has no slots at or before 8PM (i.e. timeslot <= 52). For that person the right-hand side of the constraint is 0, so they cannot work any slots after 8PM (timeslot > 52).

On the other hand, if at least one slot is scheduled at or before 8PM, the right-hand side becomes 30 (or a larger multiple of 30 if several slots are worked before 8PM), so the constraint is inactive: there are only 27 possible slots after 8PM, so as long as one slot is worked before then, this places no limit on post-8PM work.
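The two cases can be checked by hand with a small plain-Python sketch. It does not use pulp; it only evaluates the same inequality on fixed schedules. The function name `late_start_ok` and the example schedules are made up for illustration, while the cutoff of 52 and the multiplier of 30 are taken from the constraint above.

```python
def late_start_ok(worked_slots, cutoff=52, big_m=30):
    """Evaluate the constraint for one person.

    `worked_slots` is the set of timeslot numbers the person is staffed in.
    Returns True when (slots after the cutoff) <= big_m * (slots at/before it).
    """
    late = sum(1 for slot in worked_slots if slot > cutoff)
    early = sum(1 for slot in worked_slots if slot <= cutoff)
    return late <= big_m * early


starts_before_cutoff = set(range(50, 71))  # slots 50..70, begins before the cutoff
starts_after_cutoff = set(range(55, 76))   # slots 55..75, begins after the cutoff
no_shift = set()                           # person not scheduled at all

print(late_start_ok(starts_before_cutoff))
print(late_start_ok(starts_after_cutoff))
print(late_start_ok(no_shift))
```

Note that the inequality only requires some work at or before the cutoff. It depends on the separate no-split-shift constraint to ensure that the work is one contiguous shift.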

Upvotes: 1
