Reputation: 4839
I am using Python 2.7 with SciPy to calculate a distance matrix for an array.
I don't understand how to find the distance I want in the returned condensed matrix.
See this example:
from scipy.spatial.distance import pdist
import numpy as np
a = np.array([[1],[4],[0],[5]])
print a
print pdist(a)
The last line prints:
[ 3. 1. 4. 4. 1. 5.]
I found here that the ij entry in the condensed matrix should store the distance between the i-th and j-th entries, where i < j. I am wondering whether they mean ij as i*j or as str.join(i,j), e.g. 1,2 -> 2 or 12.
I can't find a consistent way to determine the index I need.
In my example, if the first option were right, you would expect all of the distances from entry 0 to every other entry to be stored at index 0.
Can anyone shed some light on how I can extract the distance between entry x and entry y? Which index am I looking for?
Thanks!
Upvotes: 0
Views: 821
Reputation: 440
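This vector is in condensed form. It enumerates all pairs of indices i,j with i < j in a natural order (in your example with four entries: 0,1
0,2
0,3
1,2
1,3
2,3
) and yields the distance between the elements at these array entries. So the six values [ 3. 1. 4. 4. 1. 5.] are the distances for exactly these six pairs, in that order.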
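If you want the position in the condensed vector directly, the ordering above can be turned into a small formula. The following is only a sketch: condensed_index is a hypothetical helper name (it is not a SciPy function), and the list d is just the output from your question typed in by hand. It runs under both Python 2.7 and Python 3.

```python
from itertools import combinations


def condensed_index(i, j, n):
    """Position of the pair (i, j) in a condensed distance vector for n points.

    Hypothetical helper, not part of SciPy.
    """
    if i == j:
        raise ValueError("no entry for i == j; that distance is always 0")
    if i > j:
        i, j = j, i  # the distance is symmetric, so order the pair
    # Entries before row i, plus the offset of j within row i
    return n * i - i * (i + 1) // 2 + (j - i - 1)


# Condensed output from the question, for a = [[1], [4], [0], [5]]
d = [3.0, 1.0, 4.0, 4.0, 1.0, 5.0]
n = 4

# The formula agrees with simply enumerating all pairs i < j in order
pairs = list(combinations(range(n), 2))
assert [condensed_index(i, j, n) for i, j in pairs] == list(range(len(d)))

# Distance between entries 1 and 3 (values 4 and 5)
print(d[condensed_index(1, 3, n)])
```

For n = 4 the pair 1,3 maps to index 4, which is the fifth value in your output.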
There is also the squareform function (also in scipy.spatial.distance), which transforms the condensed form into a square matrix form (and vice versa). The square matrix form is exactly what you expect, i.e. at entry ij (row i, column j), it stores the distance between the i-th and j-th entry. For example, if you import squareform alongside pdist and add
print squareform(pdist(a))
at the end of your code, the resulting matrix will be:
array([[ 0., 3., 1., 4.],
[ 3., 0., 4., 1.],
[ 1., 4., 0., 5.],
[ 4., 1., 5., 0.]])
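Putting it together, a lookup of the distance between entry x and entry y could look roughly like this. It is a minimal sketch that reuses the array from the question; the variable names condensed, square, x and y are only for illustration. The print call is written so that it works under both Python 2.7 and Python 3.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

a = np.array([[1], [4], [0], [5]])

condensed = pdist(a)            # 6 values for 4 points
square = squareform(condensed)  # 4 x 4 symmetric matrix, zeros on the diagonal

# Distance between entry x and entry y is just a normal 2-D lookup
x, y = 1, 3
print(square[x, y])

# squareform also converts a square matrix back to the condensed form
assert np.allclose(squareform(square), condensed)
```

The square form is the easier one to index, but it stores every distance twice, so for a large number of points the condensed vector uses roughly half the memory.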
Upvotes: 2