Reputation: 499
I have a list paths_list which contains the paths of files (images) in a particular folder. Example:
['/home/username/images/s1/4.jpg', '/home/username/images/s1/7.jpg',
'/home/username/images/s1/6.jpg', '/home/username/images/s1/3.jpg',
'/home/username/images/s1/5.jpg', '/home/username/images/s1/10.jpg',
'/home/username/images/s1/9.jpg', '/home/username/images/s1/1.jpg',
'/home/username/images/s1/2.jpg', '/home/username/images/s1/12.jpg',
'/home/username/images/s1/11.jpg', '/home/username/images/s1/8.jpg']
I want to sort them in the order: [/1.jpg, /2.jpg, ..., /12.jpg]
Neither sorting via length nor via alphabetical order is helping. What should be done here?
Upvotes: 9
Views: 19112
Reputation: 500
To piggyback off of Shir's answer, if your file names are version numbers such as 1.0.ext, 2.3.4.ext, 3.0.ext, you can use:
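import re
from pathlib import Path

files = Path('/your/path/here').glob('*.ext')
# Keep only files whose whole stem is dot-separated integers, e.g. "2.3.4".
# fullmatch rejects stems such as "1.0.beta", which would make int() fail below.
files = [
    f for f in files
    if re.fullmatch(r"[0-9]+(\.[0-9]+)+", f.stem)
]
# Compare the version components as integers, so 2.10 sorts after 2.3.
files = sorted(
    files,
    key=lambda s: [int(u) for u in s.stem.split('.')]
)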
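To try the key function without touching the filesystem, it can be run on a few made-up names. This is only a sketch: the file names and the helper `version_key` are hypothetical, and `PurePosixPath` is used instead of `Path` so that nothing has to exist on disk.

```python
import re
from pathlib import PurePosixPath

# Hypothetical file names, used only for illustration.
names = ["2.3.4.ext", "10.0.ext", "1.0.ext", "2.10.ext", "notes.ext"]


def version_key(name):
    """Return the stem's dot-separated components as a list of ints."""
    return [int(part) for part in PurePosixPath(name).stem.split(".")]


# Keep only names whose whole stem is dot-separated integers.
versioned = [
    n for n in names
    if re.fullmatch(r"[0-9]+(\.[0-9]+)+", PurePosixPath(n).stem)
]

# Lists of ints compare element by element, so [2, 3, 4] < [2, 10] < [10, 0].
sorted_names = sorted(versioned, key=version_key)
print(sorted_names)
```

Here "notes.ext" is filtered out before sorting, and "2.10.ext" lands after "2.3.4.ext" because 10 is compared with 3 as a number.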
Upvotes: 0
Reputation: 583
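I find this neat:
from pathlib import Path  # pathlib comes with Python
# files is the list of path strings. Compare the stems as integers,
# because comparing the names as strings would put 10.jpg before 2.jpg.
sorted_files = sorted(files, key=lambda image_path: int(Path(image_path).stem))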
Upvotes: 0
Reputation: 117856
You can use sorted with a lambda. For the sorting criterion, you can use os.path to first pull out just the file name (using basename), then split off the extension (using splitext).
Lastly, convert the result to int so you sort numerically instead of lexicographically.
>>> import os
>>> l = ['/home/username/images/s1/4.jpg', '/home/username/images/s1/7.jpg', '/home/username/images/s1/6.jpg', '/home/username/images/s1/3.jpg', '/home/username/images/s1/5.jpg', '/home/username/images/s1/10.jpg', '/home/username/images/s1/9.jpg', '/home/username/images/s1/1.jpg', '/home/username/images/s1/2.jpg', '/home/username/images/s1/12.jpg', '/home/username/images/s1/11.jpg', '/home/username/images/s1/8.jpg']
>>> sorted(l, key=lambda i: int(os.path.splitext(os.path.basename(i))[0]))
['/home/username/images/s1/1.jpg',
'/home/username/images/s1/2.jpg',
'/home/username/images/s1/3.jpg',
'/home/username/images/s1/4.jpg',
'/home/username/images/s1/5.jpg',
'/home/username/images/s1/6.jpg',
'/home/username/images/s1/7.jpg',
'/home/username/images/s1/8.jpg',
'/home/username/images/s1/9.jpg',
'/home/username/images/s1/10.jpg',
'/home/username/images/s1/11.jpg',
'/home/username/images/s1/12.jpg']
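One caveat with the int() conversion: it raises ValueError as soon as a file name is not a pure integer. A minimal sketch of one way around that, assuming you want numeric names first and everything else afterwards, is shown below. The helper `numeric_stem_key` and the file `cover.jpg` are hypothetical and only serve the illustration.

```python
import os


def numeric_stem_key(path):
    """Sort key: numeric stems first (by value), other stems after (by name)."""
    stem = os.path.splitext(os.path.basename(path))[0]
    # isdecimal() is True only for strings that int() can parse as digits.
    if stem.isdecimal():
        return (0, int(stem), "")
    return (1, 0, stem)


paths = [
    "/home/username/images/s1/10.jpg",
    "/home/username/images/s1/cover.jpg",  # hypothetical non-numeric name
    "/home/username/images/s1/2.jpg",
]

# Tuples compare element by element, so the leading 0/1 flag decides the group.
ordered = sorted(paths, key=numeric_stem_key)
print(ordered)
```

The key always returns a tuple of the same shape (int, int, str), so Python never has to compare an int with a str.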
Upvotes: 19
Reputation: 1649
Inspired by @Cory Kramer's answer, you can use the pathlib library and get a natural sort of the paths:
from pathlib import Path
a = ['/home/username/images/s1/4.jpg',
'/home/username/images/s1/7.jpg',
'/home/username/images/s1/6.jpg',
'/home/username/images/s1/3.jpg',
'/home/username/images/s1/5.jpg',
'/home/username/images/s1/10.jpg',
'/home/username/images/s1/9.jpg',
'/home/username/images/s1/1.jpg',
'/home/username/images/s1/2.jpg',
'/home/username/images/s1/12.jpg',
'/home/username/images/s1/11.jpg',
'/home/username/images/s1/8.jpg']
a = [Path(i) for i in a]
sorted_a = sorted(a, key=lambda i: int(i.stem))
sorted_a = [str(i) for i in sorted_a]
output:
['/home/username/images/s1/1.jpg',
'/home/username/images/s1/2.jpg',
'/home/username/images/s1/3.jpg',
'/home/username/images/s1/4.jpg',
'/home/username/images/s1/5.jpg',
'/home/username/images/s1/6.jpg',
'/home/username/images/s1/7.jpg',
'/home/username/images/s1/8.jpg',
'/home/username/images/s1/9.jpg',
'/home/username/images/s1/10.jpg',
'/home/username/images/s1/11.jpg',
'/home/username/images/s1/12.jpg']
In general, pathlib can sometimes give cleaner code than plain os.path.
Upvotes: 5
Reputation: 1003
You can split on "/", take the last element, split on ".", take the first element, and convert it to an int:
l = ['/home/username/images/s1/4.jpg', '/home/username/images/s1/7.jpg', '/home/username/images/s1/6.jpg', '/home/username/images/s1/3.jpg', '/home/username/images/s1/5.jpg', '/home/username/images/s1/10.jpg', '/home/username/images/s1/9.jpg', '/home/username/images/s1/1.jpg', '/home/username/images/s1/2.jpg', '/home/username/images/s1/12.jpg', '/home/username/images/s1/11.jpg', '/home/username/images/s1/8.jpg']
sorted_list = sorted(l, key = lambda x: int(x.split("/")[-1].split(".")[0]))
output
['/home/username/images/s1/1.jpg',
'/home/username/images/s1/2.jpg',
'/home/username/images/s1/3.jpg',
'/home/username/images/s1/4.jpg',
'/home/username/images/s1/5.jpg',
'/home/username/images/s1/6.jpg',
'/home/username/images/s1/7.jpg',
'/home/username/images/s1/8.jpg',
'/home/username/images/s1/9.jpg',
'/home/username/images/s1/10.jpg',
'/home/username/images/s1/11.jpg',
'/home/username/images/s1/12.jpg']
Upvotes: 1
Reputation: 2163
Use natural sorting (see this question) via the third-party natsort package: it keeps the code clean and is good practice when sorting strings that contain numbers.
from natsort import natsorted
l = ['/home/username/images/s1/4.jpg', '/home/username/images/s1/7.jpg', '/home/username/images/s1/6.jpg', '/home/username/images/s1/3.jpg', '/home/username/images/s1/5.jpg', '/home/username/images/s1/10.jpg', '/home/username/images/s1/9.jpg', '/home/username/images/s1/1.jpg', '/home/username/images/s1/2.jpg', '/home/username/images/s1/12.jpg', '/home/username/images/s1/11.jpg', '/home/username/images/s1/8.jpg']
natsorted(l)
gives
['/home/username/images/s1/1.jpg',
'/home/username/images/s1/2.jpg',
'/home/username/images/s1/3.jpg',
'/home/username/images/s1/4.jpg',
'/home/username/images/s1/5.jpg',
'/home/username/images/s1/6.jpg',
'/home/username/images/s1/7.jpg',
'/home/username/images/s1/8.jpg',
'/home/username/images/s1/9.jpg',
'/home/username/images/s1/10.jpg',
'/home/username/images/s1/11.jpg',
'/home/username/images/s1/12.jpg']
Natural sorting orders strings the way a person would read them: runs of digits are compared as numbers and the rest alphabetically, rather than everything being compared character by character, so 2.jpg comes before 10.jpg.
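If installing a package is not an option, the idea can be approximated with the standard library. The following is a rough sketch, not natsort's actual implementation: the helper `natural_key` is a hypothetical name, and it only handles plain runs of digits (no signs, decimals, or locale rules).

```python
import re


def natural_key(text):
    """Split text into non-digit and digit runs; digit runs become ints."""
    # With a capturing group, re.split returns the pieces in strict
    # alternation: text, digits, text, digits, ..., text.
    parts = re.split(r"(\d+)", text)
    # Odd positions hold the digit runs, even positions the text between them,
    # so ints are only ever compared with ints and strings with strings.
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]


paths = [
    "/home/username/images/s1/10.jpg",
    "/home/username/images/s1/2.jpg",
    "/home/username/images/s1/1.jpg",
]

ordered = sorted(paths, key=natural_key)
print(ordered)
```

Because the whole path is split, the digit in the folder name s1 also becomes a number, which is harmless here since every path shares the same folder.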
Upvotes: 13
Reputation: 2642
The other answers here are good, but I would like to post mine with some explanation:
from os.path import basename,splitext
path_list = ['/home/username/images/s1/4.jpg', '/home/username/images/s1/7.jpg',
'/home/username/images/s1/6.jpg', '/home/username/images/s1/3.jpg',
'/home/username/images/s1/5.jpg', '/home/username/images/s1/10.jpg',
'/home/username/images/s1/9.jpg', '/home/username/images/s1/1.jpg',
'/home/username/images/s1/2.jpg', '/home/username/images/s1/12.jpg',
'/home/username/images/s1/11.jpg', '/home/username/images/s1/8.jpg']
new_list = [splitext(basename(x))[0] for x in path_list]
fin_list = list(zip(path_list,new_list))
fin_list = [x[0] for x in sorted(fin_list,key=lambda x: int(x[1]))]
print(fin_list)
1) Create a list that contains only the file names without their extensions: 1, 2, ... and so on.
new_list = [splitext(basename(x))[0] for x in path_list]
Note: why [0]? Because splitext(basename(x)) returns a tuple like this:
('1', '.jpg'), ('4', '.jpg')
so index [0] gives us just the file name without the extension.
2) zip pairs each item of one iterable with the corresponding item of the other, and list turns the result into a list of tuples:
fin_list = list(zip(path_list,new_list))
# each item looks like this
('/home/username/images/s1/4.jpg', '4')
3) [x[0] for x in sorted(fin_list,key=lambda x: int(x[1]))]
This builds the final list from the sorted fin_list. The key is the main thing here: it is the second item of each tuple (4, 3, 7, ... and so on) converted to int, and the sorting is based on it. x[0] then keeps only the path from each sorted tuple.
Finally, your output:
['/home/username/images/s1/1.jpg', '/home/username/images/s1/2.jpg',
'/home/username/images/s1/3.jpg', '/home/username/images/s1/4.jpg',
'/home/username/images/s1/5.jpg', '/home/username/images/s1/6.jpg',
'/home/username/images/s1/7.jpg', '/home/username/images/s1/8.jpg',
'/home/username/images/s1/9.jpg', '/home/username/images/s1/10.jpg',
'/home/username/images/s1/11.jpg', '/home/username/images/s1/12.jpg']
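To make the three steps above easier to follow, here is the same pipeline on a shortened list with the intermediate values kept in separate variables. The names `stems`, `pairs`, and `ordered` are chosen for this sketch and do not appear in the answer's code.

```python
from os.path import basename, splitext

path_list = [
    "/home/username/images/s1/10.jpg",
    "/home/username/images/s1/2.jpg",
    "/home/username/images/s1/1.jpg",
]

# Step 1: file names without directory or extension (still strings).
stems = [splitext(basename(p))[0] for p in path_list]

# Step 2: pair every full path with its stem.
pairs = list(zip(path_list, stems))

# Step 3: sort the pairs by the stem as an integer, then keep only the path.
ordered = [path for path, stem in sorted(pairs, key=lambda pair: int(pair[1]))]

print(stems)
print(pairs[0])
print(ordered)
```

Keeping the stems as strings until the key function runs means the original paths are never modified; only the comparison uses the integer value.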
Upvotes: 1