TheWaveLad
TheWaveLad

Reputation: 1016

Most efficient way to build cost matrix for linear sum assignment?

Suppose we want to solve a linear sum assignment with scipy and the cost for the assignment can be build from Euclidean distances.

Thus, out of m workers W=[j_1, ..., j_m] and n tasks T=[t_1, ..., t_n], the cost matrix is given by

cost_matrix = np.array([
    [np.linalg.norm(x - y) for x in W] for y in T
])

This looks computationally heavy and not very efficient. Is there a numpy/scipy way to do this better and faster?


Working example:

import numpy as np
from scipy.optimize import linear_sum_assignment

np.random.seed(0)

# define tasks 
t = np.random.rand(5)

# define workers
w = np.random.rand(3)

cost_matrix = np.array([[np.linalg.norm(x-y) for x in w] for y in t])  
>>> linear_sum_assignment(cost_matrix)
(array([1, 2, 4]), array([2, 0, 1]))

Upvotes: 1

Views: 1481

Answers (2)

Pharoah Jardin
Pharoah Jardin

Reputation: 134

This may not be the most efficient way but iteration is passed on to numpy so this may be faster:

import numpy as np
from scipy.optimize import linear_sum_assignment

np.random.seed(0)

# define tasks 
t = np.random.rand(5)

# define workers
w = np.random.rand(3)

W, T = np.meshgrid(w, t)
cost_matrix = abs(T-W)

Upvotes: 1

gnodab
gnodab

Reputation: 878

I believe what you are looking for is cdist. Scipy cdist

Y = cdist(XA, XB, 'euclidean')

Here is your example with working code:

import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

np.random.seed(0)

# define tasks 
t = np.random.rand(5)

# define workers
w = np.random.rand(3)

# cost_matrix = np.array([[np.linalg.norm(x-y) for x in w] for y in t])  
cost_matrix = cdist(np.array([t]).T, np.array([w]).T, 'euclidean')
linear_sum_assignment(cost_matrix)

Upvotes: 1

Related Questions