Reputation: 1187
I have a string
deplete mineral resources , from 123 in x 123 in x 19 ft , on 24 ft t shaped hole
and a list of strings
['123', '123', '19', '24', 'in', 'in', 'ft', 'ft', 'deplete mineral', 't', 'resources', 'shaped hole']
I want to sort this list based on the given string. When I run sorted(l, key=s.index), I get this output:
['deplete mineral', 't', 'in', 'in', 'resources', '123', '123', '19', 'ft', 'ft', '24', 'shaped hole']
But my desired output is:
['deplete mineral', 'resources', '123', 'in' , '123', 'in' , '19', 'ft', '24', 'ft', 't' , 'shaped hole']
The list should follow exactly the order in which its items appear in the given string. Is there an efficient way to achieve this?
Upvotes: 0
Views: 657
Reputation: 5385
This produces the desired pattern. It's not technically a sort though - just a regular expression search of the sort string.
>>> import re
>>>
>>> sort_str = "deplete mineral resources , from 123 in x 123 in x " \
... "19 ft , on 24 ft t shaped hole"
>>>
>>> str_list = ['123', '123', '19', '24', 'in', 'in', 'ft', 'ft',
... 'deplete mineral', 't', 'resources', 'shaped hole']
>>>
>>> re.findall('|'.join(str_list), sort_str)
['deplete mineral', 'resources', '123', 'in', '123', 'in', '19',
'ft', '24', 'ft', 't', 'shaped hole']
>>>
>>>
>>> desired = ['deplete mineral', 'resources', '123', 'in' , '123',
... 'in' , '19', 'ft', '24', 'ft', 't' , 'shaped hole']
>>> desired == re.findall('|'.join(str_list), sort_str)
True
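For context, here is a small sketch of why the sorted(l, key=s.index) attempt from the question gives the wrong order. It assumes s and l are the string and list from the question. str.index returns the position of the first occurrence of the item as a substring, so 't' is found inside "deplete" and 'in' inside "mineral", and duplicates all receive the same key.

```python
s = ("deplete mineral resources , from 123 in x 123 in x "
     "19 ft , on 24 ft t shaped hole")
l = ['123', '123', '19', '24', 'in', 'in', 'ft', 'ft',
     'deplete mineral', 't', 'resources', 'shaped hole']

# str.index gives the FIRST position where the item occurs as a substring,
# regardless of word boundaries.
print(s.index('t'))    # 5, the 't' inside "deplete"
print(s.index('in'))   # 9, the 'in' inside "mineral"

# Both copies of '123' (and of 'in', 'ft') get the same key, so they end up
# next to each other instead of being interleaved as in the string.
print(sorted(l, key=s.index))
```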
The regular expression is simple. It's of the form "alt_1|alt_2|alt_3". That OR-like expression produces a pattern matcher that scans a string looking for the substrings "alt_1", "alt_2", or "alt_3". str_list is joined together to form this OR-like expression in this simple fashion:
>>> '|'.join(str_list)
'123|123|19|24|in|in|ft|ft|deplete mineral|t|resources|shaped hole'
For this list, the ordering of the alternatives in the above expression isn't important, because no item is a prefix of another; they could be in any order.
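As a hedged aside on the general case: Python's re module tries alternatives left to right and keeps the first one that matches at a given position, so the order can matter when one alternative is a prefix of another. The strings below are made up for illustration and do not come from the question.

```python
import re

text = "in inch"   # hypothetical input, not from the question

# 'in' is listed first, so it wins at every position and 'inch' never matches.
short_first = re.findall("in|inch", text)
print(short_first)   # ['in', 'in']

# With the longer alternative first, 'inch' is matched where it occurs.
long_first = re.findall("inch|in", text)
print(long_first)    # ['in', 'inch']
```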
This string expression is turned into a regular expression internally when passed in as the first parameter to re.findall() and used to find all matching substrings in sort_str with the following line:
>>> re.findall('|'.join(str_list), sort_str)
re.findall() scans sort_str from beginning to end looking for substrings that are part of str_list. Each occurrence is added to the list it returns. So the substrings matched will be in the same order as the words in sort_str.
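The approach above could be wrapped up along the following lines. This is only a sketch under a few assumptions: order_by_appearance is a hypothetical helper name, re.escape is added in case an item contains regex metacharacters such as '.' or '+', and longer items are placed first so that a short item cannot shadow a longer one that starts with it.

```python
import re

def order_by_appearance(items, text):
    """Return the items in the order they occur in text (hypothetical helper)."""
    # Drop duplicates and put longer items first in the alternation.
    unique = sorted(set(items), key=len, reverse=True)
    if not unique:
        return []
    # Escape each item so it is matched literally, then OR them together.
    pattern = "|".join(re.escape(item) for item in unique)
    return re.findall(pattern, text)

sort_str = ("deplete mineral resources , from 123 in x 123 in x "
            "19 ft , on 24 ft t shaped hole")
str_list = ['123', '123', '19', '24', 'in', 'in', 'ft', 'ft',
            'deplete mineral', 't', 'resources', 'shaped hole']

print(order_by_appearance(str_list, sort_str))
```

Note that this still matches substrings anywhere in the text, so a short item such as 't' could match inside a word that no other item covers. Wrapping each alternative in \b word boundaries would be one possible tightening if that matters for your data.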
Upvotes: 1