Reputation: 133
Suppose I have a bunch of students and a couple of professors. Each student needs to take an oral exam with one of the professors. For this, every student supplies an (ordered) list consisting of three professors he would prefer taking the exam with. Of course, each professor can only give a limited number of exams. I can use the Kuhn-Munkres algorithm to compute an assignment where as many students as possible are assigned to their first wish professor.
Now suppose instead that every student has to take two exams, and accordingly supplies two wish lists. Again I want to assign as many students as possible to their first wish professors, but subject to the constraint that a student must not take both his exams with the same professor.
Is there some algorithm for computing an optimal assignment efficiently? Maybe a generalization of the Kuhn-Munkres algorithm? Or is this new problem already NP-hard? What hard problem could one try to reduce to this one?
I should emphasize that I am looking for an exact solution. It is pretty easy to think of some heuristics, like considering one type of exam after the other.
Upvotes: 3
Views: 2326
Reputation: 33509
I think you can solve this optimally by solving a min cost flow problem.
The Python code below illustrates how this can be done for 3 students using the networkx library.
import networkx as nx
G=nx.DiGraph()
multiple_prefs=[{'Tom':['Prof1','Prof2','Prof3'],
'Dick':['Prof2','Prof1','Prof3'],
'Harry':['Prof1','Prof3','Prof1']},
{'Tom':['Prof2','Prof1','Prof3'],
'Dick':['Prof2','Prof1','Prof3'],
'Harry':['Prof1','Prof3','Prof1']}
]
workload={'Prof1':2,'Prof2':10,'Prof3':4}
num_tests=len(multiple_prefs)
num_students=len(multiple_prefs[0])
G.add_node('dest',demand=num_tests*num_students)
A=[]
for i,prefs in enumerate(multiple_prefs):
testname='_test%d'%i
for student,proflist in prefs.items():
node=student+testname
A.append(node)
G.add_node(node,demand=-1)
for i,prof in enumerate(proflist):
if i==0:
cost=0 # happy to assign first choices
elif i==1:
cost=1 # Slightly unhappy to assign second choices
else:
cost=4 # Very unhappy to assign third choices
n2=prof+'_with_'+student
n3=prof
G.add_edge(node,n2,capacity=1,weight=cost) # Edge taken if student takes this test with the professor
G.add_edge(n2,n3,capacity=1,weight=0) # Capacity stops student taking both tests with the same professor
G.add_edge(n3,'dest',capacity=workload[prof],weight=0) # Capacity stops professor exceeding his workload
flowdict = nx.min_cost_flow(G)
for student in A:
for prof,flow in flowdict[student].items():
if flow:
print student,'with',prof
The key is that we have a separate node (called n2 in the code) for each combination of student and professor. By giving n2 a capacity of 1 to the destination, we ensure that each student can only take a single test with that professor.
The dictionary multiple_prefs stores two dictionaries. The first dictionary contains the lists of preferences for each student for the first test. The second has the preferences for the second test.
EDIT
The code now also includes nodes n3 for each professor. The capacity on the edges between these nodes and the dest will limit the maximum workload for each professor.
The dictionary workload stores the maximum number of allowed tests for each professor.
Upvotes: 1
Reputation: 22496
You can model this exactly using Integer programming as an Assignment Problem.
Variables
Let there be s students and p professors.
The decision variable
X_sp is 1 if student s takes an exam with professor p
0 otherwise
Let Limit_p be the number of exams that professor p can handle.
Handling Student Preferences
We can handle each student's preference through the costs (objective)
Let C_sp be the 'cost' of assigning student s to take an exam with professor p.
We keep progressively making the costs higher as the preference decreases.
C_sp = 1 if p is the *first* preference of s
C_sp = 2 if p is the *second* preference of s
C_sp = 3 if p is the *third* preference of s
...
C_sp = n if p is the *n th* preference of s
Formulation
Case 1: One Exam
Min (sum over s)(sum over p) C_sp X_sp
subject to:
(sum over p) X_sp = 1 for each student s # Every student must take an exam
(sum over s) X_sp <= Limit_p for each professor p # Every prof can only handle so many exams
X_sp = {0,1} # binary
Case 2: student takes two Exams
Min (sum over s)(sum over p) C_sp X_sp
subject to:
(sum over p) X_sp = 2 for each student s # Every student must take an exam
(sum over s) X_sp <= Limit_p for each professor p # Every prof can only handle so many exams
X_sp = {0,1} # binary
Taking care of no student can take both exams with the same professor
Normally, we'd have to introduce indicator variables Y_sp
to denote whether or not a student has taken an exam with professor p. However, in this case we don't even have to do that. The fact that X_sp
is binary will ensure that no student ends up taking 2 exams with the same professor.
If the student has a preference of which exam they'd like to take with a professor, then adding a subscript e
to the decision variable is required. X_spe
You can solve this either through the Hungarian method, or easier still, any available implementation of LP/IP solver.
The Hungarian algorithm's complexity is O(n^3), and since we haven't introduced any side constraints to spoil the integrality property, the linear solution will always be integral.
Update (A little bit of theory)
Why are the solutions integral? To understand why the solutions are guaranteed to be integral, a little bit of theory is necessary.
Typically, when confronted with an IP, the first thought is to try solving the linear relaxation (Forget the integral values requirements and try.) This will give a lower bound to minimization problems. There are usually a few so-called complicating constraints that make integer solutions very difficult to find. One solution technique is to relax those by throwing them to the objective function, and solving the simpler problem (the Lagrangian Relaxation problem.)
Now, if the solution to the Lagrangian doesn't change whether or not you have integrality (as is the case with Assignment Problems) then you always get integer solutions.
For those who are really interested in learning about Integrality and Lagrangian duals, this reference is quite readable. The example in Section 3 specifically covers the Assignment Problem.
Free MILP Solvers: This SO question is quite exhaustive for that.
Hope that helps.
Upvotes: 2