Reputation: 1035
I would like to compute all possible pairwise differences (without repetition) between the columns of a matrix. What's an efficient / pythonic way to do this?
import numpy as np
mat = np.random.normal(size=(10, 3))
mat
array([[ 1.57921282,  0.76743473, -0.46947439],
       [ 0.54256004, -0.46341769, -0.46572975],
       [ 0.24196227, -1.91328024, -1.72491783],
       [-0.56228753, -1.01283112,  0.31424733],
       [-0.90802408, -1.4123037 ,  1.46564877],
       [-0.2257763 ,  0.0675282 , -1.42474819],
       [-0.54438272,  0.11092259, -1.15099358],
       [ 0.37569802, -0.60063869, -0.29169375],
       [-0.60170661,  1.85227818, -0.01349722],
       [-1.05771093,  0.82254491, -1.22084365]])
In this matrix there are 3 pairwise differences (N choose 2 unique combinations with N = 3 columns, where order doesn't matter).
pair_a = mat[:, 0] - mat[:, 1]
pair_b = mat[:, 0] - mat[:, 2]
pair_c = mat[:, 1] - mat[:, 2]
is one (ugly) way. You can easily imagine using nested for loops for larger matrices, but I am hoping there's a nicer way.
I would like the result to be another matrix, with scipy.misc.comb(mat.shape[1], 2) columns (scipy.special.comb in current SciPy releases) and mat.shape[0] rows.
Upvotes: 0
Views: 121
Reputation: 1035
Incidentally, here is the solution I came up with. Much less elegant than moarningsun's.
import numpy as np
from scipy.special import comb  # scipy.misc.comb is gone from newer SciPy

def pair_diffs(mat):
    # number of unordered column pairs
    n_pairs = comb(mat.shape[1], 2, exact=True)
    pairs = np.empty([mat.shape[0], n_pairs])
    this_pair = 0
    # compute all differences:
    for i in range(mat.shape[1] - 1):
        for j in range(i + 1, mat.shape[1]):
            pairs[:, this_pair] = mat[:, i] - mat[:, j]
            this_pair += 1
    return pairs
Upvotes: 0
Reputation:
Combinations of length 2 can be found using the following trick:
N = mat.shape[1]
I, J = np.triu_indices(N, 1)
result = mat[:,I] - mat[:,J]
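Why this works: np.triu_indices(N, 1) returns the row and column indices of the strictly upper triangle of an N x N array, which are exactly the index pairs (i, j) with i < j. Below is a minimal sketch that wraps the trick in a function and checks the pair order against itertools.combinations. The function name pair_diffs_triu and the seeded random input are assumptions made for illustration, not part of the answer above.

```python
from itertools import combinations

import numpy as np


def pair_diffs_triu(mat):
    """Return mat[:, i] - mat[:, j] for every column pair with i < j."""
    mat = np.asarray(mat)
    n_cols = mat.shape[1]
    # Strictly upper-triangular indices: k=1 skips the main diagonal.
    i_idx, j_idx = np.triu_indices(n_cols, 1)
    # Fancy indexing picks the columns; the subtraction is elementwise.
    return mat[:, i_idx] - mat[:, j_idx]


rng = np.random.default_rng(0)  # hypothetical test data
mat = rng.normal(size=(10, 3))
result = pair_diffs_triu(mat)

# Index pairs produced by triu_indices, as plain Python ints.
pairs = [(int(i), int(j)) for i, j in zip(*np.triu_indices(3, 1))]
```

The columns of result should follow the same order as combinations(range(N), 2), so column k can be mapped back to its (i, j) pair if needed.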
Upvotes: 5
Reputation: 26040
In [7]: m, n = 5, 4; arr = np.arange(m*n).reshape((m, n))
In [8]: arr
Out[8]:
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19]])
In [9]: from itertools import combinations
In [10]: def diffs(arr):
   ....:     arr = np.asarray(arr)
   ....:     n = arr.shape[1]
   ....:     for i, j in combinations(range(n), 2):
   ....:         yield arr[:, i] - arr[:, j]
   ....:
In [11]: for x in diffs(arr): print(x)
[-1 -1 -1 -1 -1]
[-2 -2 -2 -2 -2]
[-3 -3 -3 -3 -3]
[-1 -1 -1 -1 -1]
[-2 -2 -2 -2 -2]
[-1 -1 -1 -1 -1]
If you need them in an array, then just preallocate the array and assign the rows (or columns, as desired).
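One way the preallocation could look is sketched below. It assumes Python 3.8+ for math.comb, and the wrapper name diffs_as_array is made up for illustration; each difference is stored as a column, matching the layout asked for in the question.

```python
from itertools import combinations
from math import comb  # assumption: Python 3.8 or newer

import numpy as np


def diffs(arr):
    """Yield arr[:, i] - arr[:, j] for each column pair with i < j."""
    arr = np.asarray(arr)
    n = arr.shape[1]
    for i, j in combinations(range(n), 2):
        yield arr[:, i] - arr[:, j]


def diffs_as_array(arr):
    """Collect the generated differences as columns of one array."""
    arr = np.asarray(arr)
    n_rows, n_cols = arr.shape
    # One output column per unordered pair of input columns.
    out = np.empty((n_rows, comb(n_cols, 2)), dtype=arr.dtype)
    for k, d in enumerate(diffs(arr)):
        out[:, k] = d
    return out


arr = np.arange(20).reshape((5, 4))
out = diffs_as_array(arr)  # shape (5, 6)
```

To store the differences as rows instead, allocate the array with the two dimensions swapped and assign with out[k, :] = d.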
Upvotes: 1