kjo
kjo

Reputation: 35321

Does python have a predicate to test that row of a matrix are sorted?

I want to check that a sequence of N numpy vectors of integers is lexicographically ordered. All the vectors in the sequence have shape 1 × 2. (The value of N is big, so I want to avoid sorting this sequence if it is already sorted.)

Does Python, or numpy, already offer a predicate to perform such a test?

(It would not be hard to roll my own, but I prefer to use built-in tools if they exist.)

Upvotes: 0

Views: 337

Answers (3)

tom10
tom10

Reputation: 69242

numpy has lexsort, but this does a sort, not a test of whether the data is sorted. None-the-less, running it on sorted data is about twice as fast as unsorted.

import numpy as np
import timeit

def data(N):
    return np.random.randint(0,10,(N,2))

def get_sorted(x):
    return x[np.lexsort(x.T)]

x = data(5)
y  = get_sorted(x)

print x  # to verify lex sorting
print
print y
print

x = data(1000)
y  = get_sorted(x)

# to test the time for sorted vs unsorted data
print timeit.timeit("np.lexsort(x.T)", "from __main__ import np, x", number=1000)
print timeit.timeit("np.lexsort(y.T)", "from __main__ import np, y", number=1000)

And here are the results:

[[6 7]   # unsorted
 [4 3]
 [6 7]
 [9 2]
 [7 3]]

[[9 2]   # sorted by the second column first
 [4 3]
 [7 3]
 [6 7]
 [6 7]]

0.0788  # time to lex sort 1000x2 unsorted data values
0.0381  # time to lex sort 1000x2 pre-sorted data values

Note also that the speed of python vs numpy will depend on the list, because python can sometime short-circuit its tests. So if you think that your list will generally be unsorted, a pure python solution could figure this out in the first few values, which could be much faster; whereas numpy solutions will generally work the the entire array.

Upvotes: 1

hpaulj
hpaulj

Reputation: 231540

To have an issorted predicate you need a well defined sort, or at least a clear method of comparing items.

To follow on my question about the nature of your data. It sounds as though you have something like this:

In [130]: x=[[1,3],[3,4],[1,2],[3,1],[0,2],[6,5]]
In [131]: x1=[np.array(i).reshape(1,2) for i in x]
In [132]: x1
Out[132]: 
[array([[1, 3]]),
 array([[3, 4]]),
 array([[1, 2]]),
 array([[3, 1]]),
 array([[0, 2]]),
 array([[6, 5]])]

The Python sort is lexographic - that is, sort of the 1st element on the sublists, and then on the 2nd element.

In [137]: sorted(x)
Out[137]: [[0, 2], [1, 2], [1, 3], [3, 1], [3, 4], [6, 5]]

numpy sorts don't preserve the pairs - depending on the axis specification it sorts by column, or row (or flat). But the np.sort doc does say that complex numbers are sorted lexographically:

In [157]: xj = np.dot(x,[1,1j])
In [158]: xj
Out[158]: array([ 1.+3.j,  3.+4.j,  1.+2.j,  3.+1.j,  0.+2.j,  6.+5.j])
In [159]: np.sort(xj)
Out[159]: array([ 0.+2.j,  1.+2.j,  1.+3.j,  3.+1.j,  3.+4.j,  6.+5.j])

This matches the Python list sort.


If my guess as to your data type is correct, a comparison based test would use something like:

In [167]: [i.__lt__(j) for i,j in zip(x[:-1],x[1:])]
Out[167]: [True, False, True, False, True]
In [168]: xs=sorted(x)
In [169]: [i.__lt__(j) for i,j in zip(xs[:-1],xs[1:])]
Out[169]: [True, True, True, True, True]

That also works for the complex array:

In [173]: xjs=np.sort(xj)
In [174]: [i.__lt__(j) for i,j in zip(xjs[:-1],xjs[1:])]
Out[174]: [True, True, True, True, True]

For large lists I'd try one of the itertools for short circuiting iteration.

But when applied to the (plain) array, it is clear that the question of whether it is sorted or not needs further specification.

In [172]: [i.__lt__(j) for i,j in zip(x1[:-1],x1[1:])]
Out[172]: 
[array([[ True,  True]], dtype=bool),
 array([[False, False]], dtype=bool),
 array([[ True, False]], dtype=bool),
 array([[False,  True]], dtype=bool),
 array([[ True,  True]], dtype=bool)]

By the way, a list of (2,1) arrays would look something like this:

[np.array(i).reshape(1,2) for i in x]

[array([[1, 3]]),
 array([[3, 4]]),
 array([[1, 2]]),
 array([[3, 1]]),
 array([[0, 2]]),
 array([[6, 5]])]

which if turned into an array would have a (6,1,2) shape. Or did you want a (6,2) array?

In [179]: np.array(x)
Out[179]: 
array([[1, 3],
       [3, 4],
       [1, 2],
       [3, 1],
       [0, 2],
       [6, 5]])

Upvotes: 1

tillsten
tillsten

Reputation: 14878

You can use np.diff and np.any:

A = np.array([[1,2,3], [2,3,1], [3, 4, 5]])
diff =  np.diff(A, axis=0)
print np.all(diff>=0, axis=0)

Upvotes: 1

Related Questions