Zhenglun Chen

Reputation: 11

How to sort alphanumeric strings in Python so that alphabetical order comes first?

I am dealing with a dictionary with alphanumeric keys, and I need to sort it by key. The desired order is:

a1 : 3
a2 : 2
b1 : 2
b2 : 3
c1 : 5
a1 b2 : 3
a2 b1 : 2
a1 c1 : 3
a2 c1 : 2
b1 c1 : 2
b2 c1 : 3
a1 b2 c1 : 3
a2 b1 c1 : 2

However, what I have got so far is:

 for key in sorted(s1.keys(), key=lambda item: (len(item), item)):
     print("%s: %s" % (key, s1[key]))

 a1: 3
 a2: 2
 b1: 2
 b2: 3
 c1: 5
 a1 b2: 3
 a1 c1: 3
 a2 b1: 2
 a2 c1: 2
 b1 c1: 2
 b2 c1: 3
 a1 b2 c1: 3
 a2 b1 c1: 2

What I want is to order the groups by their letters first, A -> B -> C -> AB -> AC -> BC -> ABC, and then sort within each group by the numbers. For example, for AB, if I have a1b1, a2b1, a1b2, a2b2, the order should be a1b1, a1b2, a2b1, a2b2.

Upvotes: 1

Views: 1895

Answers (2)

tobias_k

Reputation: 82899

As a key function, you could split and zip the keys:

>>> s = 'a1 b2 c1'
>>> list(zip(*s.split()))
[('a', 'b', 'c'), ('1', '2', '1')]

To sort a single block such as b1 before a pair such as a1 b2, you also have to take the number of segments into account.

For your s1 data:

>>> sorted(s1, key=lambda s: (s.count(' '), list(zip(*s.split()))))
['a1',
 'a2',
 'b1',
 'b2',
 'c1',
 'a1 b2',
 'a2 b1',
 'a1 c1',
 'a2 c1',
 'b1 c1',
 'b2 c1',
 'a1 b2 c1',
 'a2 b1 c1']

If there can be more than one letter or digit per block, you could use re.findall instead:

>>> import re
>>> s = "aa12 bb34 cc56"
>>> re.findall(r"[a-z]+", s), re.findall(r"\d+", s)
(['aa', 'bb', 'cc'], ['12', '34', '56'])
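The two re.findall calls can be combined into a complete key function. The following is only a minimal sketch: the name block_key and the sample keys are hypothetical, and converting the digit groups to int is an added assumption (it makes a10 sort after a2, which plain string comparison would not do).

```python
import re

def block_key(s):
    """Hypothetical helper: sort key for strings such as 'aa12 bb34'."""
    letters = re.findall(r"[a-z]+", s)                   # e.g. ['aa', 'bb']
    numbers = [int(n) for n in re.findall(r"\d+", s)]    # e.g. [12, 34]
    # Fewer blocks first, then by letters, then by numeric value.
    return (len(letters), letters, numbers)

# Made-up sample keys, not taken from the question's data.
keys = ["b1", "a10", "a2", "a1 b2", "aa12 bb34", "a2 b1"]
result = sorted(keys, key=block_key)
print(result)
# ['a2', 'a10', 'b1', 'a1 b2', 'a2 b1', 'aa12 bb34']
```

The same key can be passed to sorted(s1, key=block_key) for the dictionary in the question.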

Upvotes: 2

ewcz

Reputation: 13087

One possibility would be to extend your approach and explicitly partition the letters and numbers in the creation of the sorting key:

d = {
'a1': 3,
'a2': 2,
'b1': 2,
'b2': 3,
'c1': 5,
'a1 b2': 3,
'a2 b1': 2,
'a1 c1': 3,
'a2 c1': 2,
'b1 c1': 2,
'b2 c1': 3,
'a1 b2 c1': 3,
'a2 b1 c1': 2
}

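def fn(key):
    # Assumes every block is one letter plus one digit, separated by a
    # single space, so the letters sit at indices 0, 3, 6, ...
    letters = key[0::3]  # extract the "letter" part of the key
    idx = key[1::3]  # extract the "numeric" part of the key

    # construct the composite key: block count, then letters, then digits
    return (len(letters), letters, idx)

for key in sorted(d, key=fn):
    print(key, d[key])

produces (shown as printed by Python 2; Python 3 prints a1 3 and so on, without the tuple parentheses)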

('a1', 3)
('a2', 2)
('b1', 2)
('b2', 3)
('c1', 5)
('a1 b2', 3)
('a2 b1', 2)
('a1 c1', 3)
('a2 c1', 2)
('b1 c1', 2)
('b2 c1', 3)
('a1 b2 c1', 3)
('a2 b1 c1', 2)
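The slicing in fn depends on every block being exactly two characters wide, so a key like a10 b1 would be cut in the wrong places. As a rough sketch of one way around that, the variant below splits on whitespace instead. The name split_key and the sample keys are hypothetical, and it still assumes each block is a single letter followed by digits.

```python
def split_key(key):
    """Hypothetical variant of fn: each block is one letter, then digits."""
    blocks = key.split()
    letters = tuple(block[0] for block in blocks)       # e.g. ('a', 'b')
    idx = tuple(int(block[1:]) for block in blocks)     # e.g. (10, 1)
    # Same composite key as fn: block count, letters, then numbers.
    return (len(blocks), letters, idx)

# Made-up sample keys with multi-digit numbers.
sample = ["a10 b1", "b2", "a2 b1", "a1", "a2 b10", "a1 c1"]
ordered = sorted(sample, key=split_key)
print(ordered)
# ['a1', 'b2', 'a2 b1', 'a2 b10', 'a10 b1', 'a1 c1']
```

Comparing the numbers as int rather than as strings is what keeps a2 b10 after a2 b1 and a10 b1 after both.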

Upvotes: 1
