Reputation: 163
I am trying to find/develop an algorithm that can minimise total time for scheduling jobs (with running times) to machines. I.E the standard multiprocesser scheduling algorithm.
However I have two additional constraints: Some jobs must be completed before others (precedence). Only certain machines can run certain jobs (job exclusions). Jobs are not unique to processers though.
For example I may have 4 jobs such that -
(cost): j1 (4) , j2 (2) , j3 (10), j4 (12)
3 machines: m1, m2, m3
Such that:
m1 can run j1, j2, j3
m2 can run j1, j2 and
m3 can run j3, j4
j1 must be run before j2 and j4
I aim to find the optimal schedule for each machine in order to minimise total time.
I have looked into using an multiprocesser scheduling algorithm for unrelated machines (setting the excluded tasks to infinite cost) but this isn't ideal as the costs aren't actually variable (either infinite or the normal cost).
Using a DAG should work well for the precedence constraint but I can't work out how to include the exclusions.
At the moment I am using a greedy algorithm that maximises the cost of the freed (no longer constrained) jobs in all other processors. However I have no idea if this is optimal or even good.
I am fairly certain this is NP-hard so I don't assume an optimal algorithm exists. It may be that using multiprocesser scheduling is the issue here as it may be better to use knapsacking or minimal path algorithms.
Help with both constraints would be preferred but literature on dealing with excluded jobs is minimal (according to some googling) so advice on just dealing with this would be beneficial.
Upvotes: 0
Views: 724
Reputation: 11014
You can use OR-Tools CP-SAT Solver for this.
See
https://developers.google.com/optimization/scheduling/job_shop
in the case each task can only be executed on one machine, or
https://github.com/google/or-tools/blob/master/examples/python/flexible_job_shop_sat.py
if each task can have alternative machines where exactly one must be chosen.
Upvotes: 1
Reputation: 16724
One way to handle this is as follows.
Introduce a binary variable assign(j,m)
to indicate if we assign a job j to machine m. Then just fix all variables assign(j,m)=0
for all combinations that are not allowed. A good MIP solver will remove the corresponding variables from the model in the presolve phase.
When I do this, and complete the MIP formulation and solve it, I see:
---- 63 --------------- data ---
---- 63 SET ok allowed job/machine combinations
machine1 machine2 machine3
job1 YES YES
job2 YES YES
job3 YES YES
job4 YES
---- 63 PARAMETER proctime processing time
job1 4.000, job2 2.000, job3 10.000, job4 12.000
---- 63 SET prec precedence
job2 job4
job1 YES YES
---- 63 --------------- solution -----
---- 63 VARIABLE assign.L assign job to machine
machine1 machine2 machine3
job1 1.000
job2 1.000
job3 1.000
job4 1.000
---- 63 VARIABLE start.L job start time
job2 4.000, job4 4.000
---- 63 VARIABLE finish.L job finish time
job1 4.000, job2 6.000, job3 10.000, job4 16.000
---- 63 VARIABLE makespan.L = 16.000 time last job is finished
(Note that start times of zero are not printed)
The complete model is here.
Upvotes: 2
Reputation: 37232
I aim to find the optimal schedule for each machine in order to minimise total time.
Don't.
Outside of hypothetical textbook scenarios; a scheduler typically has no idea when more jobs will be started, how long any job might take, when any computer (or CPU) will change speed, when/if there will be hardware failures, if/when a job will be cancelled (or crash), ...
The only sane/practical approach is to forget about "optimal in theory for a fictional scenario" and dynamically adapt to changing/"unknowable in advance" conditions, while making decisions that are "good enough" because you can't afford to waste CPU time trying to find a "perfect for a fraction of a millisecond maybe but now it's too late to matter because it took too long to decide" solution.
Mostly, there are extremely frequent "events" that effect scheduling (jobs starting, stopping, waiting for IO, continuing because the IO happened, power management changes, etc); and the scheduler has to react to these events when they occur, in a fast/efficient manner.
If some jobs must be completed before others, then that not the scheduler's problem - it's a problem that whoever wrote the code for those tasks has to deal with (adding a "wait for the other job to complete before I continue" where it's appropriate - e.g. maybe a waitpid()
).
If only certain machines can run certain jobs, then that should be trivial (you only need some kind of "affinity mask" to control where the job is able to run).
Upvotes: 0