Modified assignment problem (more tasks than agents)

Assume that N is the number of agents and M is the number of tasks. The number of tasks is greater than number of agents, i.e. M > N. Each agent must have at least one task. Given rectangular matrix of costs, find the optimal solution (i.e. assign each task to exactly one agent so each agent has at least one task and the cost is minimized).

What effective algorithm could solve this problem?

I've tried to implement naive recursive algorithm with memorization, but it is too slow for values of M over 1000. I know about Hungarian method, but I wasn't able to use the algorithm with my constraint (each agent must have at least one task).

Upvotes: 1

Answers (2)

Erwin Kalvelagen

Reputation: 16724

LP/MIP Model

Here is a simple optimization model:

# sets
i : tasks
j : agents

# model
min sum((i,j), c[i,j]*x[i,j])
sum(j, x[i,j]) = 1  ∀i  "assign each task to an agent"
sum(i, x[i,j]) >= 1 ∀j  "each agent needs to do at least one task"
x[i,j] ∈ {0,1}

These models tend to solve very easily.

Here is a small test script:

import cvxpy as cp
import numpy as np

NTASK = 1000
NAGENT = 200

# random data
np.random.seed(123)
c = np.random.uniform(1,100,(NTASK,NAGENT))

# model
x = cp.Variable((NTASK,NAGENT),boolean=True)
prob = cp.Problem(cp.Minimize(cp.sum(cp.multiply(c,x))),
                  [cp.sum(x,axis=1) == 1, 
                   cp.sum(x,axis=0) >= 1])

# solve and print results
res = prob.solve(verbose=True)
print(prob.status)
print(prob.value)
# print(x.value)

If you prefer, this model can be solved as an LP by relaxing the x variable to x[i,j] ∈ [0,1].

Network Model

This can also be modeled as a simple min-cost flow network problem.

The data for this network would be:

Node        Supply          Arc         Cost      Capacity
Task nodes     1      task→worker      c[i,j]        1
Worker nodes  -1      worker→sink        0          n-m 
Sink node   −(n−m)

Upvotes: 3

btilly

Reputation: 46399

This can be formulated as a Minimum Cost Maximum Flow Problem.

Start with a sink. It connects to the tasks along channels of cost 0 and flow 1. Each task connects to the agents along channels of costs from your matrix and flow 1. Each agent connects to the sink with a channel of flow 1 and cost 0. Agents are also connected to a pool with channels of flow M-N and high cost. And the pool is connected to the sink with a channel of flow M-N and similarly high cost.

The maximum flow with minimum cost will have a flow of M from the source to the tasks. It will then have a flow of M from the source to the agents. A flow of N will take the cheap and narrow pipes to the sink directly from the agents. And the remaining M-N will take the expensive route to the pool, and from the pool to the sink. Because the flow from agents to pool and back again is expensive, there will be only a minimum of flow to the pool, and no flow from the pool back to the agents.

Therefore the maximum flow will be an answer to your problem with each agent getting the flow from at least one task.

Upvotes: 6

Modified assignment problem (more tasks than agents)

Answers (2)

LP/MIP Model

Network Model

Related Questions