Most efficient way to build cost matrix for linear sum assignment?

Question

Suppose we want to solve a linear sum assignment with scipy and the cost for the assignment can be build from Euclidean distances.

Thus, out of m workers W=[j_1, ..., j_m] and n tasks T=[t_1, ..., t_n], the cost matrix is given by

cost_matrix = np.array([
    [np.linalg.norm(x - y) for x in W] for y in T
])

This looks computationally heavy and not very efficient. Is there a numpy/scipy way to do this better and faster?

Working example:

import numpy as np
from scipy.optimize import linear_sum_assignment

np.random.seed(0)

# define tasks 
t = np.random.rand(5)

# define workers
w = np.random.rand(3)

cost_matrix = np.array([[np.linalg.norm(x-y) for x in w] for y in t])

>>> linear_sum_assignment(cost_matrix)
(array([1, 2, 4]), array([2, 0, 1]))

gnodab · Accepted Answer

I believe what you are looking for is cdist. Scipy cdist

Y = cdist(XA, XB, 'euclidean')

Here is your example with working code:

import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

np.random.seed(0)

# define tasks 
t = np.random.rand(5)

# define workers
w = np.random.rand(3)

# cost_matrix = np.array([[np.linalg.norm(x-y) for x in w] for y in t])  
cost_matrix = cdist(np.array([t]).T, np.array([w]).T, 'euclidean')
linear_sum_assignment(cost_matrix)

Most efficient way to build cost matrix for linear sum assignment?

Answers (2)

Related Questions