Reputation: 1279
I need to sort a list of strings that contain digits at the beginning and end of each string, first by the beginning digits, then by the ending digits. So the beginning digits have priority over the ending digits.
For example:
l = ['900abc5', '3000abc10', '1000abc5', '1000abc10', '900abc20']
Would become:
l = ['900abc5', '900abc20','1000abc5','1000abc10','3000abc10']
I know that l.sort() will not work here because it sorts lexicographically. The other methods I tried seemed excessively complicated (for example: splitting the strings by matching beginning digits, then splitting again by ending digits, sorting, concatenating, and then recombining the list). Even summarizing that method shows that it is not efficient!
Edit: after playing around with the natsort module I found that natsorted(l) solves my particular issue.
Upvotes: 0
Views: 1263
Reputation: 48047
You may create a custom function to extract the numbers from the string and use that function as the key to sorted().
For example: In the below function, I am using regex to extract the number:
import re
def get_nums(my_str):
    # Return every run of digits in the string as a list of ints
    return list(map(int, re.findall(r'\d+', my_str)))
Refer to "Python: Extract numbers from a string" for more alternatives.
Then make a call to the sorted function using get_nums() as the key:
>>> l = ['900abc5', '3000abc10', '1000abc5', '1000abc10', '900abc20']
>>> sorted(l, key=get_nums)
['900abc5', '900abc20', '1000abc5', '1000abc10', '3000abc10']
Note: Based on your example, my regex assumes that there will be a number only at the start and the end of the string, with all intermediate characters being non-numeric.
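If the middle of the string might contain digits as well, one possible workaround is to anchor the pattern so that only the leading and trailing digit runs are used. This is just a sketch; the helper name edge_nums and the sample string '900x7y5' are mine, not from the question, and I am assuming every string really does start and end with digits:

```python
import re

# Hypothetical helper: capture only the leading and trailing digit runs.
_EDGE_DIGITS = re.compile(r"^(\d+).*?(\d+)$")

def edge_nums(my_str):
    match = _EDGE_DIGITS.match(my_str)
    if match is None:
        # Assumption: strings without digits at both ends are an error.
        raise ValueError("expected digits at both ends: %r" % my_str)
    # A tuple sorts by the leading number first, then by the trailing number.
    return int(match.group(1)), int(match.group(2))

mixed = ['900x7y5', '3000abc10', '1000abc5', '1000abc10', '900abc20']
result = sorted(mixed, key=edge_nums)
```

Here the 7 in the middle of '900x7y5' is ignored, so that string is keyed as (900, 5).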
Upvotes: 4
Reputation: 214927
Here is an option that uses regex to find the leading and trailing digits and uses them as keys in the sorted function:
import re
# Raw strings keep \d from being treated as a string escape sequence
sorted(l, key=lambda x: (int(re.findall(r"^\d+", x)[0]), int(re.findall(r"\d+$", x)[0])))
# ['900abc5', '900abc20', '1000abc5', '1000abc10', '3000abc10']
Upvotes: 1
Reputation: 1022
Python's sorted function accepts a key parameter, which should be a function that transforms each list element into a sorting value. In your case, you want to sort by the numbers in the string. For example, for '900abc5' the key would be [900, 5], and so on. So you want to pass in a key function that transforms the string into a list of integers.
Using regular expressions, it's quite easy to extract the digits from the string. All you need to do is to map the digits into actual numbers, as regular expressions return string matches.
I believe the code below should work:
import re
l = ['900abc5', '3000abc10', '1000abc5', '1000abc10', '900abc20']
def by_digits(e):
    digits_as_string = re.findall(r"\d+", e)
    # Return a list: in Python 3, map() gives an iterator, which cannot be compared
    return list(map(int, digits_as_string))
sorted(l, key=by_digits)
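To see why this gives the right order: Python compares lists element by element, so the first number decides and the second only breaks ties. A quick check you can run (digit_key is just an illustrative name for the same idea, written so that it returns a list):

```python
import re

def digit_key(s):
    # Illustrative name; converts each run of digits to an int.
    return [int(d) for d in re.findall(r"\d+", s)]

l = ['900abc5', '3000abc10', '1000abc5', '1000abc10', '900abc20']

# Show the key that each string is sorted by.
for s in l:
    print(s, digit_key(s))

# Lists compare element by element: first number, then second.
first_number_wins = [900, 20] < [1000, 5]
second_breaks_ties = [900, 5] < [900, 20]

result = sorted(l, key=digit_key)
```

Comparing the original strings directly would put '1000abc10' before '900abc5', because '1' sorts before '9' as a character.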
Upvotes: 0