Reputation: 13
I wonder if there are very simple ways to calculate pairwise subtraction for two elements in a multi-dimensional array which is consisted of vectors USING a function in NUMPY or SCIPY library.
Let me introduce an example:
>>> a = (np.arange(9)).reshape((3,3)) # a list of 3 vectors (x, y, z)
>>> a
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
I want to get the following:
>>>> result
array([[3,3,3],
[6,6,6],
[3,3,3]])
# first element comes from [3,4,5] - [0,1,2]
# second element comes from [6,7,8] - [0,1,2]
# third element comes from [6,7,8] - [3,4,5]
I do not matter about the signs (+/-) on the result which depends on the order of subtraction of two vectors. But, I want to know VERY SIMPLE version of code using pre-defined functions in Scipy or Numpy libraries such as scipy.spatial.distance.pdist.
I do need loop codes to iterate elements-wise for result, instead, I need just one line to get the result.
Upvotes: 1
Views: 1649
Reputation: 221524
Approach #1
Get the pairwise indices with np.triu_indices
, index into rows of a
with those and compute the differences -
In [8]: r,c = np.triu_indices(len(a),1)
In [9]: a[c] - a[r]
Out[9]:
array([[3, 3, 3],
[6, 6, 6],
[3, 3, 3]])
Approach #2
We can also use slicing that avoids creating the indices and the indexing part itself that creates copies of the input array for the slices needed for subtraction. Thus, we would work with views only, but we need to iterate in that process. The advantage of slicing shows off for large arrays on performance, as we would verify later on in timings. The implementation would be -
n = len(a)
N = n*(n-1)//2
idx = np.concatenate(( [0], np.arange(n-1,0,-1).cumsum() ))
start, stop = idx[:-1], idx[1:]
out = np.empty((N,a.shape[1]),dtype=a.dtype)
for j,i in enumerate(range(n-1)):
out[start[j]:stop[j]] = a[i+1:] - a[i,None]
Approaches as funcs -
def pairwise_row_diff_triu_indices(a):
r,c = np.triu_indices(len(a),1)
out = a[c] - a[r]
return out
def pairwise_row_diff_slicing(a):
n = len(a)
N = n*(n-1)//2
idx = np.concatenate(( [0], np.arange(n-1,0,-1).cumsum() ))
start, stop = idx[:-1], idx[1:]
out = np.empty((N,a.shape[1]),dtype=a.dtype)
for j,i in enumerate(range(n-1)):
out[start[j]:stop[j]] = a[i+1:] - a[i,None]
return out
Timings -
In [53]: np.random.seed(0)
In [54]: a = np.random.randint(0,9,(1000,3))
In [55]: %timeit pairwise_row_diff_triu_indices(a)
...: %timeit pairwise_row_diff_slicing(a)
10 loops, best of 3: 21 ms per loop
100 loops, best of 3: 6.01 ms per loop
In [56]: a = np.random.randint(0,9,(5000,3))
In [57]: %timeit pairwise_row_diff_triu_indices(a)
...: %timeit pairwise_row_diff_slicing(a)
1 loop, best of 3: 456 ms per loop
10 loops, best of 3: 110 ms per loop
Upvotes: 3
Reputation: 698
This is not using numpy or scipy functions...but it is a simple solution
length = a.shape[1]
new_arr = []
for ii in range(length):
for jj in range(ii+1,length):
new_arr.append(a[ii,]-a[jj,])
Upvotes: 0