stefan

Reputation: 19

lambda sorted list strange behavior

When I sort by the first or second element of each tuple in the list, it works, but sorting by the third element does not. The input:

mylist = [('11', '82075.36', '8.15'), ('16', '82073.78', '12.92'), ('13', '62077.99', '17.89')]

the request:

print(sorted(mylist, key=lambda val: val[0]))
print(sorted(mylist, key=lambda val: val[1]))
print(sorted(mylist, key=lambda val: val[2]))

and the output:

[('11', '82075.36', '8.15'), ('13', '62077.99', '17.89'), ('16', '82073.78', '12.92')] # this is OK

[('13', '62077.99', '17.89'), ('16', '82073.78', '12.92'), ('11', '82075.36', '8.15')] # this is OK

[('16', '82073.78', '12.92'), ('13', '62077.99', '17.89'), ('11', '82075.36', '8.15')] # this does not seem correct; can anybody explain why?

If I remove the quotes from the third element, it works. Yet for the first element it works without removing the quotes.

mylist = [('11', '82075.36', 8.15), ('16', '82073.78', 12.92), ('13', '62077.99', 17.89)]

and the output:

[('11', '82075.36', 8.15), ('16', '82073.78', 12.92), ('13', '62077.99', 17.89)]

Upvotes: 1

Views: 75

Answers (3)

aghast

Reputation: 15310

This is a common problem for beginners. You are sorting "ascii-betically" instead of numerically.

That is, (1, 2, 3) and ("1", "2", "3") are different, because one is a tuple of integers while the other is a tuple of strings.

The ASCII code (and Unicode, etc.) is defined in such a way that 0, 1, 2, ... 9 are all in "increasing" order. This means that "foo1" and "foo2" will sort, as strings, into what seems like the right sequence.

But sorting a number as a string fails when the numbers are of different lengths. For example, which one is greater, "9" or "10"? Well, as a string, "1" comes before "9", so the sequence is "10", "9". But as integers, it would be 9, 10.
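A minimal sketch of that difference, using a made-up list of values (`values` is only for illustration and is not from the question):

```python
# Strings compare character by character; numbers compare by value.
print("10" < "9")  # True: "1" sorts before "9"
print(10 < 9)      # False

# Hypothetical sample values, stored as strings.
values = ["9", "10", "100", "2"]

as_strings = sorted(values)           # lexicographic order
as_numbers = sorted(values, key=int)  # numeric order, items stay strings

print(as_strings)  # ['10', '100', '2', '9']
print(as_numbers)  # ['2', '9', '10', '100']
```

Note that `key=int` only changes how items are compared; the sorted list still contains the original strings.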

This is why people who put numbers in filenames always use leading zeroes. Filenames get sorted as strings, so "009" and "100" show up in the correct order, but "9" and "100" don't!
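As a rough illustration of the leading-zero trick, `str.zfill` pads a string with zeroes on the left. The names below are invented for the example, and the width of 3 is an assumption that fits these particular values:

```python
# Hypothetical numeric name parts, stored as strings.
names = ["9", "100", "10"]

# Pad every entry to the same width so string order matches numeric order.
padded = [n.zfill(3) for n in names]

print(sorted(names))   # ['10', '100', '9']
print(sorted(padded))  # ['009', '010', '100']
```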

So, to answer your original question: the numbers in your first two columns all have the same number of digits before the decimal point, so string order and numeric order happen to agree. The third column mixes "8.15" with "12.92" and "17.89", and that difference in length is what makes the string sort come out wrong, as explained above.
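One way to check this on the data from the question is to sort each column both as strings and as floats and see whether the two orders agree. This is only a sketch; the names `agree`, `by_string`, and `by_number` are made up here:

```python
mylist = [('11', '82075.36', '8.15'), ('16', '82073.78', '12.92'), ('13', '62077.99', '17.89')]

agree = []
for col in range(3):
    by_string = sorted(mylist, key=lambda row: row[col])         # lexicographic
    by_number = sorted(mylist, key=lambda row: float(row[col]))  # numeric
    agree.append(by_string == by_number)

print(agree)  # [True, True, False]
```

Only the third column disagrees, because it is the only one whose values differ in length before the decimal point.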

Upvotes: 0

chepner

Reputation: 531055

Lexicographical order is different from numerical order. '70' < '9' is true, but 70 < 9 is false. If you want to compare string items as numbers, you have to do so explicitly.

print(sorted(mylist, key=lambda val: int(val[0])))

Upvotes: 0

Gil Hamilton

Reputation: 12347

You're asking for the strings to be sorted, so you're getting string sorting. Try this instead:

print(sorted(mylist, key=lambda val: int(val[0])))
print(sorted(mylist, key=lambda val: float(val[1])))
print(sorted(mylist, key=lambda val: float(val[2])))

I.e. change the sort keys to numerics.
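If the values are needed as numbers later anyway, an alternative is to convert each row once and then sort on the converted data. This is a sketch of that variation, not something the question requires; `rows` and `by_third` are illustrative names:

```python
mylist = [('11', '82075.36', '8.15'), ('16', '82073.78', '12.92'), ('13', '62077.99', '17.89')]

# Convert every field up front: first column to int, the other two to float.
rows = [(int(a), float(b), float(c)) for a, b, c in mylist]

# Sorting now compares numbers, so no conversion is needed in the key.
by_third = sorted(rows, key=lambda row: row[2])
print(by_third)
# [(11, 82075.36, 8.15), (16, 82073.78, 12.92), (13, 62077.99, 17.89)]
```

The trade-off is that the sorted result holds numbers rather than the original strings, whereas converting inside the key function leaves the tuples unchanged.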

Upvotes: 1
