Reputation: 334
I've got the following code:
cases = []
for file in files:
    # Get value from files and write to data
    data = [ id, b, c, d, e, f, g, h, i, j, k ]
    # Append the data row to the cases list
    cases.append(data)
# Sort the cases descending
cases.sort(reverse=True)
After running the for loop, the cases list looks like this:
cases = [ ['id', val, val], ['id', val, val], ['id', val, val] ] etc.
id is a value like '600', '900', '1009', '1009a' or '1010', which I want to sort descending.
At the moment '1009a' is on top of the list, while I want it to be between '1009' and '1010'. This is probably related to '1009a' being parsed as unicode while the other values are being parsed as long. A debugger also confirms this.
I've tried converting the id field to unicode using unicode(id) while writing the data list, but this does not give the desired result either. After sorting cases, the output starts at '999', runs down to '600', and then starts again at '1130' and runs down to '1000'. What I want is for it to start at '1130' and run down to '600', with '1009a' between '1009' and '1010'.
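The ordering described above can be reproduced with a minimal sketch, assuming every id has already been converted to a string. The sample ids are hypothetical, and the snippet is written for Python 3, where str plays the role of unicode:

```python
# Hypothetical sample ids, all stored as strings (as after unicode(id)).
ids = ['600', '999', '1000', '1009', '1009a', '1010', '1130']

# Strings are compared character by character, so anything starting
# with '9' or '6' ranks above anything starting with '1'.
by_text = sorted(ids, reverse=True)
print(by_text)  # ['999', '600', '1130', '1010', '1009a', '1009', '1000']
```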
Upvotes: 2
Views: 122
Reputation: 15204
Same principle as the one used by @Tobias_k, but not quite as neat.
from itertools import takewhile, dropwhile

cases = [ ['600', 'foo1', 'bar1'], ['900', 'foo2', 'bar2'], ['1009', 'foo6', 'bar6'], ['1009a', 'foo3', 'bar3'], ['1010', 'foo4', 'bar4'] ]

def sorter_helper(str_):
    # Leading numeric characters form the number part
    n = ''.join(takewhile(lambda x: x.isnumeric(), str_))
    # Everything from the first non-numeric character on is the suffix
    s = ''.join(dropwhile(lambda x: x.isnumeric(), str_))
    return (int(n), s)

# Sort by the (number, suffix) tuple built from the id in the first column
cases = sorted(cases, key=lambda x: sorter_helper(x[0]))
print(cases) # -> [['600', 'foo1', 'bar1'], ['900', 'foo2', 'bar2'], ['1009', 'foo6', 'bar6'], ['1009a', 'foo3', 'bar3'], ['1010', 'foo4', 'bar4']]
Upvotes: 0
Reputation: 462
Your problem is that when the ids are unicode, they are compared character by character starting from the first one, so 9>1 and therefore 900>1000.
What you need to do is write leading zeros for all your id fields so that 900 becomes 0900 and is now less than 1000. You can do this with this bit of code (although there are probably neater ways of doing it):
id = str(id).zfill(5)
Note that you don't need the str() bit if id is already a string. Here, zfill(5) adds zeros to the left of the string until it has length 5.
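One caveat worth checking is how padding interacts with suffixed ids such as '1009a'. The sketch below, written for Python 3 with hypothetical sample ids, first shows what plain zfill(5) produces and then pads only the leading digits. pad_numeric_prefix is an illustrative helper, not part of the original answer:

```python
import re

ids = ['600', '900', '1009', '1009a', '1010']

# zfill pads to a total length of 5, so '1009a' (already 5 characters)
# receives no leading zero and would still sort above '01010'.
padded = [i.zfill(5) for i in ids]
print(padded)  # ['00600', '00900', '01009', '1009a', '01010']

def pad_numeric_prefix(id_, width=5):
    """Hypothetical helper: zero-pad only the leading digits, keep the suffix."""
    digits, suffix = re.match(r'(\d*)(.*)', id_).groups()
    return digits.zfill(width) + suffix

# Sorting on the padded key leaves the original ids untouched.
result = sorted(ids, key=pad_numeric_prefix, reverse=True)
print(result)  # ['1010', '1009a', '1009', '900', '600']
```

Using the padded string only as a sort key, rather than overwriting id, keeps the stored values as they were read from the files.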
Upvotes: 0
Reputation: 82889
If you are comparing strings containing numbers, they are sorted in alphabetical order, i.e. without regard to how many digits the number has. You have to convert them to int first, but that's tricky with the a/b suffix. You can use a regular expression to separate the number and the suffix:
>>> import re
>>> p = re.compile(r"(\d+)(.*)")
>>> def comp(x):
...     n, s = p.match(x).groups()
...     return int(n), s
...
>>> ids = ["1009", "1009a", "1009b", "1010", "99"]
>>> [comp(x) for x in ids]
[(1009, ''), (1009, 'a'), (1009, 'b'), (1010, ''), (99, '')]
>>> sorted(ids, key=comp)
['99', '1009', '1009a', '1009b', '1010']
Applying this to your example, you probably need this (not tested):
cases.sort(key=lambda x: comp(x[0]), reverse=True)
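As a quick check of that line, here is a minimal self-contained sketch for Python 3. The rows in cases are hypothetical, and it assumes every id is a string that begins with at least one digit (otherwise p.match returns None and the unpacking fails):

```python
import re

# Same pattern as above: leading digits, then whatever suffix follows.
p = re.compile(r"(\d+)(.*)")

def comp(x):
    n, s = p.match(x).groups()
    return int(n), s

# Hypothetical rows; only the first column (the id) drives the ordering.
cases = [['600', 'a'], ['1009a', 'b'], ['1130', 'c'],
         ['1009', 'd'], ['1010', 'e'], ['999', 'f']]

# Sort by (number, suffix), largest first.
cases.sort(key=lambda row: comp(row[0]), reverse=True)
ordered_ids = [row[0] for row in cases]
print(ordered_ids)  # ['1130', '1010', '1009a', '1009', '999', '600']
```

If some ids arrive as numbers rather than strings, wrapping the argument as comp(str(row[0])) should let the same key handle both.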
Upvotes: 4