optagon

Reputation: 11

Sorting strings in Python with numbers somewhere in the middle

I want to sort strings that contain numbers so that the numeric parts are ordered by value rather than character by character.

I found a way to sort strings that contain only numbers (Sorting numbers in string format with Python). It works well, but not when the strings mix words and numbers.

In this example I create the list in the order I want, but sorted() reorders it lexicographically.

>>> s = ['A_10x05', 'A_10x50', 'A_10x100']
>>> print(sorted(s))
['A_10x05', 'A_10x100', 'A_10x50']

Expected output:

['A_10x05', 'A_10x50', 'A_10x100']

A more complex example would be:

>>> s = ['Asset_Castle_Wall_25x400x100_Bottom_01', 'Asset_Castle_Wall_25x400x50_Top_02',
'Asset_Castle_Wall_25x400x10_Bottom_01',  'Asset_Castle_Wall_25x400x300_Top_01']
>>> print(sorted(s))
['Asset_Castle_Wall_25x400x100_Bottom_01', 'Asset_Castle_Wall_25x400x10_Bottom_01', 'Asset_Castle_Wall_25x400x300_Top_01', 'Asset_Castle_Wall_25x400x50_Top_02']

Expected output:

['Asset_Castle_Wall_25x400x10_Bottom_01', 'Asset_Castle_Wall_25x400x50_Top_02', 'Asset_Castle_Wall_25x400x100_Bottom_01',  'Asset_Castle_Wall_25x400x300_Top_01']

I think I would need to split each string at its numbers and sort part by part, using the solution above for the numeric parts. Where several strings start the same way, e.g. section[i] = 'A_', I would sort on section[i+1] and work my way to the end. This feels very complicated, though, so maybe there is a better way.

Upvotes: 0

Views: 745

Answers (3)

Adon Bilivit

Reputation: 27116

Provided each string in the list contains exactly three dimensions, you can sort by their product:

import re
from functools import cache

s = ['Asset_Castle_Wall_25x400x100_Bottom_01', 'Asset_Castle_Wall_25x400x50_Top_02',
'Asset_Castle_Wall_25x400x10_Bottom_01',  'Asset_Castle_Wall_25x400x300_Top_01']

@cache
def get_size(s):
    if len(tokens := s.split('x')) != 3:
        return 0
    # Last number before the first 'x', first number after the last 'x'
    first = re.findall(r'\d+', tokens[0])[-1]
    last = re.findall(r'\d+', tokens[-1])[0]
    return int(first) * int(tokens[1]) * int(last)

print(sorted(s, key=get_size))

Output:

['Asset_Castle_Wall_25x400x10_Bottom_01', 'Asset_Castle_Wall_25x400x50_Top_02', 'Asset_Castle_Wall_25x400x100_Bottom_01', 'Asset_Castle_Wall_25x400x300_Top_01']
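
Sorting by the product treats different dimensions with the same volume as equal (2x50x1 and 10x10x1 both give 100). If ordering dimension by dimension is what is wanted, a minimal sketch could compare the dimensions as a tuple instead. The helper name `get_dims` and the sample names are hypothetical, and the sketch assumes each name contains one run such as `25x400x100`:

```python
import re

def get_dims(name):
    # Find the first run of digits joined by 'x', e.g. '25x400x100'
    match = re.search(r'\d+(?:x\d+)+', name)
    if match is None:
        return ()  # names without dimensions sort first
    # Compare dimension by dimension instead of by volume
    return tuple(int(part) for part in match.group().split('x'))

# Hypothetical sample names; the first two have the same volume
names = ['Wall_2x50x1_Top', 'Wall_10x10x1_Top', 'Wall_2x5x1_Top']
print(sorted(names, key=get_dims))
# ['Wall_2x5x1_Top', 'Wall_2x50x1_Top', 'Wall_10x10x1_Top']
```

Unlike the product, the tuple does not require exactly three dimensions, so names with two or four dimensions would still get a usable key.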

Upvotes: 0

gimix

Reputation: 3833

I believe what you want is just to sort each part of the input strings separately - text parts alphabetically, numeric parts by numeric value, with no multiplications involved. If this is the case you will need a helper function:

from re import findall

s = ['A_10x5', 'Item_A_10x05x200_Base_01', 'A_10x100', 'B']

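def fun(s):
    # Split into runs of digits and runs of letters/underscores
    f = findall(r'\d+|[A-Za-z_]+', s)
    # Convert digit runs to int so they compare by numeric value
    return [int(x) if x.isdigit() else x for x in f]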

sorted(s, key = fun)
['A_10x5', 'A_10x100', 'B', 'Item_A_10x05x200_Base_01']
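
One caveat with a key that mixes int and str: if one name has a number where another has text at the same position (say '10_A' next to 'A_10'), Python 3 raises a TypeError when comparing them. A minimal sketch that sidesteps this is below; `natural_key` is a hypothetical name, not part of the answer above, and the sample list is made up for illustration:

```python
import re

def natural_key(name):
    # re.split with a capture group keeps the digit runs in the result,
    # so text lands at even indices and digit runs at odd indices
    parts = re.split(r'(\d+)', name)
    # Same index always holds the same type, so keys never mix int and str
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]

# Hypothetical sample: '10_A' starts with a number, the others with text
names = ['A_10x50', '10_A', 'A_10x100', 'A_10x05', 'B']
print(sorted(names, key=natural_key))
# ['10_A', 'A_10x05', 'A_10x50', 'A_10x100', 'B']
```

A name that starts with a digit yields an empty string as its first text part, which is why '10_A' sorts ahead of the names that start with a letter.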

Upvotes: 0

Mortz

Reputation: 4939

IIUC, you are trying to sort by the product of the numbers in 10x05, which you can do by passing a key function to sorted:

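s = ['A_10x05', 'A_10x50', 'A_10x100']

def eval_result(s):
    # Works only for names of the form '<prefix>_<num1>x<num2>'
    prefix, op = s.split('_')
    num1, num2 = map(int, op.split('x'))
    return num1 * num2

print(sorted(s, key=eval_result))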

Output

['A_10x05', 'A_10x50', 'A_10x100']

Upvotes: 1
