Mike El Jackson

Reputation: 755

Numpy array of distances to list of (row,col,distance)

I have an nd array that looks as follows:

[[ 0.          1.73205081  6.40312424  7.21110255  2.44948974]
 [ 1.73205081  0.          5.09901951  5.91607978  1.        ]
 [ 6.40312424  5.09901951  0.          1.          4.35889894]
 [ 7.21110255  5.91607978  1.          0.          5.09901951]
 [ 2.44948974  1.          4.35889894  5.09901951  0.        ]]

Each element in this array is a distance and I need to turn this into a list with the row,col,distance as follows:

l = [(0,0,0),(0,1, 1.73205081),(0,2, 6.40312424),...,(1,0, 1.73205081),(1,1,0),...,(4,4,0)] 

Additionally, it would be cool to remove the diagonal elements and also the elements (j,i) as (i,j) are already there. Essentially, is it possible to take just the top triangular matrix of this?

Is this possible to do efficiently (without a lot of loops)? I had created this array with squareform, but couldn't find any docs to do this.

Upvotes: 0

Views: 326

Answers (4)

Daniel F

Reputation: 14399

Do you really want the upper triangular part of an n x m matrix where n > m? That keeps only the m(m-1)/2 elements above the diagonal and loses all the data in the extra rows.

What you probably want is the lower triangular matrix:

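import numpy as np

def tri_reduce(m):
    # Return rows of (row, col, value) for the larger off-diagonal triangle of m.
    rows, cols = m.shape
    if rows > cols:
        # Tall matrix: strictly lower triangle (k=-1 skips the diagonal).
        i = np.tril_indices(rows, -1, cols)
    else:
        # Square or wide matrix: strictly upper triangle (k=1 skips the diagonal).
        i = np.triu_indices(rows, 1, cols)
    return np.vstack((i, m[i])).T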

Rebuilding it into a list of tuples would, I believe, require a loop. list(tri_reduce(m)) gives a list of ndarrays rather than tuples, and the indices come back as floats because they share an array with the distances.
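The loop can at least be hidden behind zip. A minimal sketch, assuming a symmetric distance matrix like the one in the question; the helper name triu_tuples is made up for illustration:

```python
import numpy as np

def triu_tuples(m):
    # Hypothetical helper: list of (row, col, distance) for the strict upper triangle.
    rows, cols = np.triu_indices(m.shape[0], k=1, m=m.shape[1])
    # tolist() converts NumPy scalars to plain Python ints and floats.
    return list(zip(rows.tolist(), cols.tolist(), m[rows, cols].tolist()))

d = np.array([[0.0, 1.5, 2.5],
              [1.5, 0.0, 3.5],
              [2.5, 3.5, 0.0]])
pairs = triu_tuples(d)
print(pairs)  # [(0, 1, 1.5), (0, 2, 2.5), (1, 2, 3.5)]
```

Unlike the stacked array, this keeps the indices as integers.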

Upvotes: 0

hpaulj

Reputation: 231355

squareform does all this. Read the docs and experiment. It works in both directions. If you give it a matrix it returns the upper triangle values (condensed form). If you give it those values, it returns the matrix.

In [668]: M
Out[668]: 
array([[ 0. ,  0.1,  0.5,  0.2],
       [ 0.1,  0. ,  2. ,  0.3],
       [ 0.5,  2. ,  0. ,  0.2],
       [ 0.2,  0.3,  0.2,  0. ]])
In [669]: spatial.distance.squareform(M)
Out[669]: array([ 0.1,  0.5,  0.2,  2. ,  0.3,  0.2])
In [670]: v=spatial.distance.squareform(M)
In [671]: v
Out[671]: array([ 0.1,  0.5,  0.2,  2. ,  0.3,  0.2])
In [672]: spatial.distance.squareform(v)
Out[672]: 
array([[ 0. ,  0.1,  0.5,  0.2],
       [ 0.1,  0. ,  2. ,  0.3],
       [ 0.5,  2. ,  0. ,  0.2],
       [ 0.2,  0.3,  0.2,  0. ]])

You can also specify the force and checks parameters, but without those it just goes by the shape of the input.

Indices can come from np.triu_indices:

In [677]: np.triu_indices(4,1)
Out[677]: 
(array([0, 0, 0, 1, 1, 2], dtype=int32),
 array([1, 2, 3, 2, 3, 3], dtype=int32))

In [680]: np.vstack((np.triu_indices(4,1),v)).T
Out[680]: 
array([[ 0. ,  1. ,  0.1],
       [ 0. ,  2. ,  0.5],
       [ 0. ,  3. ,  0.2],
       [ 1. ,  2. ,  2. ],
       [ 1. ,  3. ,  0.3],
       [ 2. ,  3. ,  0.2]])

Just to check, we can fill in a 4x4 matrix with these values

In [686]: A=np.vstack((np.triu_indices(4,1),v)).T
In [687]: MM = np.zeros((4,4))
In [688]: MM[A[:,0].astype(int),A[:,1].astype(int)]=A[:,2]
In [689]: MM
Out[689]: 
array([[ 0. ,  0.1,  0.5,  0.2],
       [ 0. ,  0. ,  2. ,  0.3],
       [ 0. ,  0. ,  0. ,  0.2],
       [ 0. ,  0. ,  0. ,  0. ]])

Those triu indices can also fetch the values from M:

In [693]: I,J = np.triu_indices(4,1)
In [694]: M[I,J]
Out[694]: array([ 0.1,  0.5,  0.2,  2. ,  0.3,  0.2])

squareform uses compiled code in spatial.distance._distance_wrap, so I expect it will be quite fast for large arrays. The only problem is that it returns just the condensed-form values, not the indices. But given the shape, the indices can always be calculated; they don't need to be stored with the values.
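As a rough sketch of that last point, the matrix size can be recovered from the length of the condensed vector alone, since an n x n matrix has n(n-1)/2 off-diagonal pairs. The helper name condensed_to_triples is hypothetical, and the sketch uses only NumPy:

```python
import numpy as np

def condensed_to_triples(v):
    # Hypothetical helper: pair condensed distances with their (row, col) indices.
    v = np.asarray(v, dtype=float)
    # A condensed vector of an n x n matrix has n*(n-1)/2 entries; solve for n.
    n = (1 + int(round(np.sqrt(1 + 8 * v.size)))) // 2
    if n * (n - 1) // 2 != v.size:
        raise ValueError("length is not n*(n-1)/2 for any integer n")
    # Row-major order of the strict upper triangle matches the condensed order.
    rows, cols = np.triu_indices(n, k=1)
    return list(zip(rows.tolist(), cols.tolist(), v.tolist()))

v = [0.1, 0.5, 0.2, 2.0, 0.3, 0.2]  # condensed values from the session above
triples = condensed_to_triples(v)
print(triples[:2])  # [(0, 1, 0.1), (0, 2, 0.5)]
```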

Upvotes: 5

rajeshcis

Reputation: 382

You can try this:

print([(x,y, value) for (x,y), value in np.ndenumerate(numpymatrixarray)])

output [(0, 0, 0.0), (0, 1, 1.7320508100000001), (0, 2, 6.4031242400000004), (0, 3, 7.2111025499999997), (0, 4, 2.4494897400000002), (1, 0, 1.7320508100000001), (1, 1, 0.0), (1, 2, 5.0990195099999998), (1, 3, 5.9160797799999996), (1, 4, 1.0), (2, 0, 6.4031242400000004), (2, 1, 5.0990195099999998), (2, 2, 0.0), (2, 3, 1.0), (2, 4, 4.3588989400000004), (3, 0, 7.2111025499999997), (3, 1, 5.9160797799999996), (3, 2, 1.0), (3, 3, 0.0), (3, 4, 5.0990195099999998), (4, 0, 2.4494897400000002), (4, 1, 1.0), (4, 2, 4.3588989400000004), (4, 3, 5.0990195099999998), (4, 4, 0.0)]

Upvotes: 0

John Zwinck

Reputation: 249123

If your input is x, first generate the indices:

i0,i1 = np.indices(x.shape)

Then:

np.concatenate((i1,i0,x)).reshape(3,5,5).T

That gives you the first result, for the entire matrix, as a 5x5x3 array whose last axis holds (row, col, distance). It relies on x being symmetric.

As for taking only the upper triangle, you might consider trying np.triu(), but I'm not sure exactly what result you're looking for. You can probably figure out how to mask the parts you don't want now, though.
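One way the masking could look, as a sketch rather than the only option: compare the two index grids to build a boolean mask for the strict upper triangle, then apply it to the indices and the values together. The small matrix here is made-up example data.

```python
import numpy as np

x = np.array([[0.0, 1.0, 2.0],
              [1.0, 0.0, 3.0],
              [2.0, 3.0, 0.0]])

i0, i1 = np.indices(x.shape)   # i0 holds row numbers, i1 holds column numbers
mask = i1 > i0                 # True strictly above the diagonal
# Boolean indexing flattens in row-major order, so the three columns stay aligned.
upper = np.column_stack((i0[mask], i1[mask], x[mask]))
print(upper)
```

The result is a float array with one (row, col, distance) row per pair, so the indices need casting back to int if they are used for indexing later.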

Upvotes: 2
