Reputation: 1030
I have a list of strings that need to be sorted in numerical order using as a int key two substrings.
Obviously using the sort()
function orders my strings alphabetically so I get 1,10,2... that is obviously not what I'm looking for.
Searching around I found a key parameter can be passed to the sort()
function, and using sort(key=int)
should do the trick, but being my key a substring and not the whole string should lead to a cast error.
Supposing my strings are something like:
test1txtfgf10
test1txtfgg2
test2txffdt3
test2txtsdsd1
I want my list to be ordered in numeric order on the basis of the first integer and then on the second, so I would have:
test1txtfgg2
test1txtfgf10
test2txtsdsd1
test2txffdt3
I think I could extract the integer values, sort only them keeping track of what string they belong to and then ordering the strings, but I was wondering if there's a way to do this thing in a more efficient and elegant way.
Thanks in advance
Upvotes: 1
Views: 149
Reputation: 46553
Try the following
In [26]: import re
In [27]: f = lambda x: [int(x) for x in re.findall(r'\d+', x)]
In [28]: sorted(strings, key=f)
Out[28]: ['test1txtfgg2', 'test1txtfgf10', 'test2txtsdsd1', 'test2txffdt3']
This uses regex (the re
module) to find all integers in each string, then compares the resulting lists. For example, f('test1txtfgg2')
returns [1, 2]
, which is then compared against other lists.
Upvotes: 4
Reputation: 649
import re
k = [
"test1txtfgf10",
"test1txtfgg2",
"test2txffdt3",
"test2txtsdsd1"
]
tmp = [([e for e in re.split("[a-z]",el) if e], el) for el in k ]
sorted(tmp, key=lambda k: tmp[0])
tmp = [res for cm, res in tmp]
Upvotes: 0
Reputation:
Extract the numeric parts and sort using them
import re
d = """test1txtfgf10
test1txtfgg2
test2txffdt3
test2txtsdsd1"""
lines = d.split("\n")
re_numeric = re.compile("^[^\d]+(\d+)[^\d]+(\d+)$")
def key(line):
"""Returns a tuple (n1, n2) of the numeric parts of line."""
m = re_numeric.match(line)
if m:
return (int(m.groups(1)), int(m.groups(2)))
else:
return None
lines.sort(key=key)
Now lines
are
['test1txtfgg2', 'test1txtfgf10', 'test2txtsdsd1', 'test2txffdt3']
Upvotes: 0