Reputation: 1016
Suppose we want to solve a linear sum assignment with scipy and the cost for the assignment can be build from Euclidean distances.
Thus, out of m
workers W=[j_1, ..., j_m]
and n
tasks T=[t_1, ..., t_n]
, the cost matrix is given by
cost_matrix = np.array([
[np.linalg.norm(x - y) for x in W] for y in T
])
This looks computationally heavy and not very efficient. Is there a numpy/scipy way to do this better and faster?
Working example:
import numpy as np
from scipy.optimize import linear_sum_assignment
np.random.seed(0)
# define tasks
t = np.random.rand(5)
# define workers
w = np.random.rand(3)
cost_matrix = np.array([[np.linalg.norm(x-y) for x in w] for y in t])
>>> linear_sum_assignment(cost_matrix)
(array([1, 2, 4]), array([2, 0, 1]))
Upvotes: 1
Views: 1481
Reputation: 134
This may not be the most efficient way but iteration is passed on to numpy so this may be faster:
import numpy as np
from scipy.optimize import linear_sum_assignment
np.random.seed(0)
# define tasks
t = np.random.rand(5)
# define workers
w = np.random.rand(3)
W, T = np.meshgrid(w, t)
cost_matrix = abs(T-W)
Upvotes: 1
Reputation: 878
I believe what you are looking for is cdist
. Scipy cdist
Y = cdist(XA, XB, 'euclidean')
Here is your example with working code:
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist
np.random.seed(0)
# define tasks
t = np.random.rand(5)
# define workers
w = np.random.rand(3)
# cost_matrix = np.array([[np.linalg.norm(x-y) for x in w] for y in t])
cost_matrix = cdist(np.array([t]).T, np.array([w]).T, 'euclidean')
linear_sum_assignment(cost_matrix)
Upvotes: 1