jef

Reputation: 4075

Sort string list by a number in string?

I have the following list of strings and want to sort it by the number embedded in each element. Plain sorted() fails because it compares the strings character by character, so it cannot order values such as 10 and 3 correctly. I imagine I could do it with re, but that is not very interesting. Does anyone have a nicer implementation? Assume Python 3.x for this code.

names = [
'Test-1.model',
'Test-4.model',
'Test-6.model',
'Test-8.model',
'Test-10.model',
'Test-20.model'
]
number_sorted = get_number_sorted(names)
print(number_sorted)
'Test-20.model'
'Test-10.model'
'Test-8.model'
'Test-6.model'
'Test-4.model'
'Test-1.model'
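
For reference, a minimal sketch of why a plain sorted() call gives the wrong order here: strings are compared character by character, so '10' sorts before '4'. The names list is the one from the question; the variable name lexicographic is only illustrative.

```python
names = [
    'Test-1.model',
    'Test-4.model',
    'Test-6.model',
    'Test-8.model',
    'Test-10.model',
    'Test-20.model',
]

# Default string comparison is lexicographic: '1' < '2' < '4',
# so 'Test-10' and 'Test-20' land before 'Test-4'.
lexicographic = sorted(names)
print(lexicographic)
```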

Upvotes: 10

Views: 13371

Answers (7)

jpp

Reputation: 164623

Some alternatives:

(1) Slicing by position:

sorted(names, key=lambda x: int(x[5:-6]))

(2) Stripping substrings:

sorted(names, key=lambda x: int(x.replace('Test-', '').replace('.model', '')))

Or better (Python 3.9+), since these only strip at the start and end of the string:

sorted(names, key=lambda x: int(x.removeprefix('Test-').removesuffix('.model')))

(3) Splitting characters (also possible via str.partition):

sorted(names, key=lambda x: int(x.split('-')[1].split('.')[0]))

(4) Map with np.argsort on any of (1)-(3):

import numpy as np
list(map(names.__getitem__, np.argsort([int(x[5:-6]) for x in names])))
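
The prefix/suffix stripping in (2) can be wrapped in a small helper. This is only a sketch, assuming Python 3.9 or later (where str.removeprefix and str.removesuffix are available); model_number and its default arguments are hypothetical names chosen for illustration.

```python
def model_number(name, prefix='Test-', suffix='.model'):
    """Return the integer between prefix and suffix (hypothetical helper)."""
    # removeprefix/removesuffix only strip at the ends of the string,
    # unlike str.replace, which removes every occurrence anywhere.
    return int(name.removeprefix(prefix).removesuffix(suffix))

names = ['Test-10.model', 'Test-1.model', 'Test-20.model', 'Test-4.model']
number_sorted = sorted(names, key=model_number)
print(number_sorted)
```

A named key function like this is also easier to reuse and test than an inline lambda.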

Upvotes: 5

Tim Biegeleisen

Reputation: 520908

Here is a regex-based approach. We can extract the test number from the string, convert it to int, and then sort by that.

import re

def grp(txt): 
    s = re.search(r'Test-(\d+)\.model', txt, re.IGNORECASE)
    if s:
        return int(s.group(1))
    else:
        return float('-inf')  # Sorts non-matching strings ahead of matching strings

names.sort(key=grp)
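
To illustrate the fallback branch, here is a small sketch with a list that contains a non-matching entry and a differently cased name. The sample list (including 'README.txt') is made up for the example; grp is the same function as above, written compactly.

```python
import re

def grp(txt):
    s = re.search(r'Test-(\d+)\.model', txt, re.IGNORECASE)
    # float('-inf') compares lower than any int, so non-matches come first.
    return int(s.group(1)) if s else float('-inf')

names = ['Test-10.model', 'README.txt', 'test-2.MODEL', 'Test-1.model']
names.sort(key=grp)
print(names)
```

Because of re.IGNORECASE, 'test-2.MODEL' is treated like any other matching name.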

Upvotes: 1

x1084

Reputation: 330

You can use the key parameter along with sorted() to accomplish this, assuming each string is formatted the same way:

def get_number_sorted(somelist):
    return sorted(somelist, key=lambda x: int(x.split('.')[0].split('-')[1]))

It looks like you might want the list sorted in descending order (?), in which case you can add reverse=True like so:

def get_number_sorted(somelist):
    return sorted(somelist, key=lambda x: int(x.split('.')[0].split('-')[1]), reverse=True)
number_sorted = get_number_sorted(names)
print(number_sorted)
['Test-20.model', 'Test-10.model', 'Test-8.model', 'Test-6.model', 'Test-4.model', 'Test-1.model']

See related: Key Functions

Upvotes: 1

Ajax1234

Reputation: 71451

You can use re.findall in the key function passed to sorted:

import re
names = [
 'Test-1.model',
 'Test-4.model',
 'Test-6.model',
 'Test-8.model',
 'Test-10.model',
 'Test-20.model'
]
final_data = sorted(names, key=lambda x: int(re.findall(r'(?<=Test-)\d+', x)[0]), reverse=True)

Output:

['Test-20.model', 'Test-10.model', 'Test-8.model', 'Test-6.model', 'Test-4.model', 'Test-1.model']
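
As a quick illustration of what the pattern captures: the lookbehind (?<=Test-) requires the prefix to be present but does not include it in the match, so findall returns only the digits. The variable name matches is just for this sketch.

```python
import re

# The lookbehind asserts that 'Test-' precedes the digits without consuming it.
matches = re.findall(r'(?<=Test-)\d+', 'Test-10.model')
print(matches)
```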

Upvotes: 2

Claudiordgz

Reputation: 3049

def find_between( s, first, last ):
    try:
        start = s.index( first ) + len( first )
        end = s.index( last, start )
        return s[start:end]
    except ValueError:
        return ""

and then do something like

 sorted(names, key=lambda x: int(find_between(x, 'Test-', '.model')))
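
Note that find_between returns an empty string when a marker is missing, and int("") raises ValueError, so the key above still fails on unexpected names. A sketch of one way to tolerate such entries, assuming it is acceptable to sort them ahead of the numbered ones; safe_key and the sample list are hypothetical.

```python
def find_between(s, first, last):
    try:
        start = s.index(first) + len(first)
        end = s.index(last, start)
        return s[start:end]
    except ValueError:
        return ""

def safe_key(name):
    """Hypothetical key: names without a number sort before all others."""
    digits = find_between(name, 'Test-', '.model')
    # (0, 0) for non-matching names, (1, n) for matching ones.
    return (1, int(digits)) if digits.isdigit() else (0, 0)

names = ['Test-10.model', 'notes.txt', 'Test-2.model']
result = sorted(names, key=safe_key)
print(result)
```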

Upvotes: 1

jef

Reputation: 4075

I found a similar question with a solution myself: Nonalphanumeric list order from os.listdir() in Python

import re
def sorted_alphanumeric(data):
    convert = lambda text: int(text) if text.isdigit() else text.lower()
    alphanum_key = lambda key: [ convert(c) for c in re.split('([0-9]+)', key) ] 
    return sorted(data, key=alphanum_key, reverse=True)
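
To make the key visible, here is a sketch that pulls alphanum_key out as a standalone function and prints what it produces for one name. The shortened sample list is only for illustration.

```python
import re

def alphanum_key(key):
    # Split into digit and non-digit runs; digit runs become ints,
    # everything else is lower-cased text.
    return [int(c) if c.isdigit() else c.lower()
            for c in re.split('([0-9]+)', key)]

example_key = alphanum_key('Test-10.model')
print(example_key)

names = ['Test-4.model', 'Test-10.model', 'Test-1.model']
result = sorted(names, key=alphanum_key, reverse=True)
print(result)
```

Lists are compared element by element, so the integer in the middle decides the order when the text parts are equal.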

Upvotes: 3

Back2Basics

Reputation: 7806

the key is ... the key

sorted(names, key=lambda x: int(x.partition('-')[2].partition('.')[0]))

This separates out the numeric part of each string and converts it to an int, so that value is what determines the sort order.
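
A step-by-step sketch of what the two partition calls do for a single name (the intermediate variable names are only for illustration):

```python
name = 'Test-10.model'

# partition splits at the first occurrence of the separator and
# always returns a 3-tuple: (before, separator, after).
before, sep, after = name.partition('-')
print(before, sep, after)

# Take everything up to the first '.' in the remainder and convert it.
number_text = after.partition('.')[0]
number = int(number_text)
print(number)
```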

Upvotes: 7
