Reputation: 129
I have a list whose elements are like the following:
Region_1.csv, Region_33.csv, Region_2.csv, Region_4.csv, Region_105.csv, ....
The list contains every number from 1 to 105, with none missing. I want to sort it by region number so that it looks like this:
Region_1.csv, Region_2.csv, Region_3.csv, Region_4.csv, ..., Region_105.csv
Because the numbers have a varying number of digits, I am struggling to sort this list.
Thanks.
Upvotes: 1
Views: 141
Reputation: 26315
You could also try this:
>>> l = ['Region_105.csv', 'Region_1.csv', 'Region_33.csv', 'Region_2.csv', 'Region_4.csv']
>>> sorted(l, key=lambda x: int(''.join(filter(str.isdigit, x))))
['Region_1.csv', 'Region_2.csv', 'Region_4.csv', 'Region_33.csv', 'Region_105.csv']
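One caveat worth sketching: this key joins every digit in the name, so it relies on the region number being the only digits present. The name 'Region_2_v3.csv' below is a hypothetical example, not something from the question, and digits_key is just an illustrative helper name.

```python
def digits_key(name):
    # Concatenate every digit character in the name and convert to int.
    return int(''.join(filter(str.isdigit, name)))

# Hypothetical file names, used only to illustrate the caveat.
print(digits_key('Region_33.csv'))    # 33
print(digits_key('Region_2_v3.csv'))  # 23, not 2
```

For names shaped exactly like the ones in the question, this does not matter.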
Upvotes: 1
Reputation: 7846
You can also use the find method of strings:
inList = ['Region_1.csv', 'Region_33.csv', 'Region_2.csv', 'Region_4.csv', 'Region_105.csv']
outList = sorted(inList, key=lambda elem: int(elem[elem.find('_')+1:elem.find('.')]))
print(outList)
Output:
['Region_1.csv', 'Region_2.csv', 'Region_4.csv', 'Region_33.csv', 'Region_105.csv']
Upvotes: 1
Reputation: 1
You can extract the region number with a regular expression, build a dictionary of the form {region number: file name}, and then sort the dictionary by its keys. Code for extracting the region number and creating the dictionary:
import re
files = ['Region_1.csv', 'Region_33.csv', 'Region_2.csv', 'Region_4.csv', 'Region_105.csv']
d = dict()
for f in files:
    # Capture the digits between '_' and '.csv';
    # store them as int so the keys sort numerically.
    rnum = int(re.search(r'_(\d+)\.csv$', f).group(1))
    d[rnum] = f
For sorting the items in the dictionary, see: How can I sort a dictionary by key?
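As a minimal, self-contained sketch of the sorting step, assuming every file name ends in _<digits>.csv (the name sorted_files is only illustrative):

```python
import re

files = ['Region_1.csv', 'Region_33.csv', 'Region_2.csv', 'Region_4.csv', 'Region_105.csv']

# Map region number -> file name (assumes every name ends in _<digits>.csv).
d = {int(re.search(r'_(\d+)\.csv$', f).group(1)): f for f in files}

# sorted(d) yields the integer keys in ascending order.
sorted_files = [d[k] for k in sorted(d)]
print(sorted_files)
```

Note that the dictionary only works if region numbers are unique, which the question says is the case.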
Upvotes: 0
Reputation: 195428
Using the re module, which helps if you need to match something more elaborate in the string:
l = ['Region_105.csv', 'Region_1.csv', 'Region_33.csv', 'Region_2.csv', 'Region_4.csv']
import re
print(sorted(l, key=lambda v: int(re.findall(r'\d+', v)[0])))
Output:
['Region_1.csv', 'Region_2.csv', 'Region_4.csv', 'Region_33.csv', 'Region_105.csv']
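If the names might contain more than one number, a more general "natural sort" key is one option. This is only a sketch: natural_key is a made-up helper name, and it splits each name into text and number chunks so that the number chunks compare as integers.

```python
import re

def natural_key(name):
    # re.split with a capturing group keeps the digit runs in the result,
    # e.g. 'Region_105.csv' -> ['Region_', '105', '.csv'].
    # Digit chunks become ints; text chunks stay as strings.
    return [int(chunk) if chunk.isdigit() else chunk
            for chunk in re.split(r'(\d+)', name)]

names = ['Region_105.csv', 'Region_1.csv', 'Region_33.csv', 'Region_2.csv', 'Region_4.csv']
print(sorted(names, key=natural_key))
```

Because the split always alternates text and digit chunks, strings are only ever compared with strings and ints with ints.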
Upvotes: 2
Reputation: 12015
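lst = ['Region_1.csv', 'Region_33.csv', 'Region_2.csv', 'Region_4.csv', 'Region_105.csv']
# Sort in place by the integer between '_' and '.'
lst.sort(key=lambda x: int(x.split('_')[1].split('.')[0]))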
print(lst)
# ['Region_1.csv', 'Region_2.csv', 'Region_4.csv', 'Region_33.csv', 'Region_105.csv']
Upvotes: 2
Reputation: 164643
You can use sorted with a custom function, splitting first by . and then by _:
L = ['Region_1.csv', 'Region_33.csv', 'Region_2.csv', 'Region_4.csv', 'Region_105.csv']
res = sorted(L, key=lambda x: int(x.split('.')[0].split('_')[-1]))
print(res)
['Region_1.csv', 'Region_2.csv', 'Region_4.csv', 'Region_33.csv', 'Region_105.csv']
Upvotes: 4