Anupam Tripathi
Anupam Tripathi

Reputation: 73

Torch.sort and argsort sorting randomly in case of same element

When same elements are encountered, torch.sort and argsort sort the tensor in random manner. This is not the case in numpy. I have a list of elements already sorted according to the second column and now i want to sort it using the first column but preserve the earlier sort in case of tie in the new sorting.

import torch

a = torch.tensor(
        [[ 0., 3.],
        [ 2., 3.],
        [ 2., 2.],
        [10., 2.],
        [ 0., 2.],
        [ 6., 2.],
        [10., 1.],
        [ 2., 1.],
        [ 0., 1.],
        [ 6., 1.],
        [10., 0.],
        [12., 0.]]
)
print(a[torch.argsort(a[:, 0])])

Output:

tensor([[ 0.,  3.],
        [ 0.,  2.],
        [ 0.,  1.],
        [ 2.,  1.],
        [ 2.,  2.],
        [ 2.,  3.],
        [ 6.,  1.],
        [ 6.,  2.],
        [10.,  1.],
        [10.,  2.],
        [10.,  0.],
        [12.,  0.]])

Numpy:

import numpy as np

a = np.array(
        [[ 0., 3.],
        [ 2., 3.],
        [ 2., 2.],
        [10., 2.],
        [ 0., 2.],
        [ 6., 2.],
        [10., 1.],
        [ 2., 1.],
        [ 0., 1.],
        [ 6., 1.],
        [10., 0.],
        [12., 0.]]
)
print(a[np.argsort(a[:, 0])])

Output:

[[ 0.  3.]
 [ 0.  2.]
 [ 0.  1.]
 [ 2.  3.]
 [ 2.  2.]
 [ 2.  1.]
 [ 6.  2.]
 [ 6.  1.]
 [10.  2.]
 [10.  1.]
 [10.  0.]
 [12.  0.]]

What could be the reason for this? And what can I do to avoid it?

Upvotes: 3

Views: 1715

Answers (1)

LudvigH
LudvigH

Reputation: 4793

As per torch 1.9.0 you can run the sort with option stable=True. See https://pytorch.org/docs/1.9.0/generated/torch.sort.html?highlight=sort#torch.sort

>>> x = torch.tensor([0, 1] * 9)
>>> x.sort()
torch.return_types.sort(
    values=tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1]),
    indices=tensor([ 2, 16,  4,  6, 14,  8,  0, 10, 12,  9, 17, 15, 13, 11,  7,  5,  3,  1]))
>>> x.sort(stable=True)
torch.return_types.sort(
    values=tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1]),
    indices=tensor([ 0,  2,  4,  6,  8, 10, 12, 14, 16,  1,  3,  5,  7,  9, 11, 13, 15, 17]))

The documentation says this is only on the CPU, but it will come to GPU sorting soon, since that documentation warning has been removed in the master branch at github (as per https://github.com/pytorch/pytorch/pull/61685)

Upvotes: 2

Related Questions