Penguin

Reputation: 2401

How to sort a directory's files?

I have a directory with lots of images, and I'm trying to create a GIF from them. Everything works, except that the method I'm using to sort the images doesn't put them in numeric order:

#save images as gif

import glob
from PIL import Image

# filepaths
fp_in = "images/epoch*.png"
fp_out = "images/vid.gif"

# https://pillow.readthedocs.io/en/stable/handbook/image-file-formats.html#gif
img, *imgs = [Image.open(f) for f in sorted(glob.glob(fp_in))]
img.save(fp=fp_out, format='GIF', append_images=imgs,
         save_all=True, duration=500) #, loop=0

When I print sorted(glob.glob(fp_in)) I get

['images/epoch0.png',
 'images/epoch1000.png',
 'images/epoch10000.png',
 'images/epoch11000.png',
 'images/epoch12000.png',
 'images/epoch13000.png',
 'images/epoch14000.png',
 'images/epoch15000.png',
 'images/epoch16000.png',
 'images/epoch17000.png',
 'images/epoch18000.png',
 'images/epoch19000.png',
 'images/epoch2000.png',
 'images/epoch20000.png',
...

You can see that it goes from 1000 to 10000, and from 2000 to 20000, etc. What is the correct way to sort the files by their number?

Upvotes: 0

Views: 193

Answers (3)

Martin Nečas

Reputation: 593

You can use the key parameter of sorted.

For your example you can use:

sorted(glob.glob(fp_in), key=lambda x: int(x[:-4].split("images/epoch")[1]))

This sorts the file paths by their number, compared as an int rather than as a string.

(Separating the number from the name this way is a bit of a hack; you can do it in many other ways.)
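One of those other ways, as a minimal sketch: pull the digits out with a regular expression instead of slicing. The helper name epoch_number is made up for illustration, and the sketch assumes every path contains "epoch" followed by digits, as in the question.

```python
import re

def epoch_number(path):
    """Return the integer that follows 'epoch' in a path such as 'images/epoch1000.png'."""
    match = re.search(r"epoch(\d+)", path)
    if match is None:
        # Assumption: a path without an epoch number is an error, not something to skip.
        raise ValueError(f"no epoch number in {path!r}")
    return int(match.group(1))

# A few sample paths in the order a plain string sort would give them.
paths = [
    "images/epoch0.png",
    "images/epoch1000.png",
    "images/epoch10000.png",
    "images/epoch2000.png",
]

ordered = sorted(paths, key=epoch_number)
print(ordered)
```

In the question's script this would presumably be used as sorted(glob.glob(fp_in), key=epoch_number).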

Upvotes: 2

balderman

Reputation: 23815

Try the code below. The idea is to extract the number from each file name and sort by that number.

lst = ['images/epoch0.png',
       'images/epoch1000.png',
       'images/epoch10000.png',
       'images/epoch11000.png',
       'images/epoch12000.png',
       'images/epoch13000.png',
       'images/epoch14000.png',
       'images/epoch15000.png',
       'images/epoch16000.png',
       'images/epoch17000.png',
       'images/epoch18000.png',
       'images/epoch19000.png',
       'images/epoch2000.png',
       'images/epoch20000.png']

data = []
for entry in lst:
    # the number sits between 'epoch' and the first '.'
    num = entry[entry.find('epoch') + 5: entry.find('.')]
    data.append((int(num), entry))
# sort in place by the extracted number
data.sort(key=lambda x: x[0])
sorted_lst = [x[1] for x in data]
print(sorted_lst)

Output:

['images/epoch0.png', 'images/epoch1000.png', 'images/epoch2000.png', 'images/epoch10000.png', 'images/epoch11000.png', 'images/epoch12000.png', 'images/epoch13000.png', 'images/epoch14000.png', 'images/epoch15000.png', 'images/epoch16000.png', 'images/epoch17000.png', 'images/epoch18000.png', 'images/epoch19000.png', 'images/epoch20000.png']

Upvotes: 1

TheMikeste1

Reputation: 87

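The way it is currently sorting is called lexicographical sorting, which is the default for strings: they are compared character by character, so 'epoch10000' comes before 'epoch2000' because '1' is less than '2'. If you want a natural sort, I'd recommend looking at this post: Sort results non-lexicographically?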
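For reference, a natural sort key can be sketched in a few lines. This is only one possible approach, not necessarily the one in the linked post, and the name natural_key is hypothetical. It splits each string into digit and non-digit chunks and compares the digit chunks as integers.

```python
import re

def natural_key(text):
    """Build a sort key where runs of digits compare as integers."""
    # re.split with a capturing group alternates text and digit chunks:
    # even indexes hold text (possibly empty), odd indexes hold digits.
    parts = re.split(r"(\d+)", text)
    return [int(part) if index % 2 else part.lower()
            for index, part in enumerate(parts)]

names = [
    "images/epoch2000.png",
    "images/epoch10000.png",
    "images/epoch1000.png",
    "images/epoch0.png",
]

natural = sorted(names, key=natural_key)
print(natural)
```

Unlike a key tied to the "epoch" prefix, this works for any file names that mix text and numbers. Lowercasing the text chunks makes the comparison case-insensitive, which is a design choice you may not want.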

Upvotes: 0
