user9593274

Reputation: 15

Python Numerical Sorting of Strings

How do I make Python sorting behave like sort -n from GNU coreutils?

This is my images.txt:

Vol. 1/Vol. 1 - Special 1/002.png
Vol. 1/Chapter 2 example text/002.png
Vol. 1/Vol. 1 Extra/002.png
Vol. 1/Chapter 2 example text/001.png
Vol. 1/Vol. 1 Extra/001.png
Vol. 1/Chapter 1 example text/002.png
Vol. 1/Vol. 1 - Special 1/001.png
Vol. 1/Chapter 1 example text/001.png

When I run this Bash script:

#!/bin/bash

cat images.txt | sort -n

I get the following output:

Vol. 1/Chapter 1 example text/001.png
Vol. 1/Chapter 1 example text/002.png
Vol. 1/Chapter 2 example text/001.png
Vol. 1/Chapter 2 example text/002.png
Vol. 1/Vol. 1 Extra/001.png
Vol. 1/Vol. 1 Extra/002.png
Vol. 1/Vol. 1 - Special 1/001.png
Vol. 1/Vol. 1 - Special 1/002.png

But when I run this Python script:

#!/usr/bin/env python3

images = []

with open("images.txt") as images_file:
    for image in images_file:
        images.append(image)

images = sorted(images)

for image in images:
    print(image, end="")

I get the following output, which is not what I need:

Vol. 1/Chapter 1 example text/001.png
Vol. 1/Chapter 1 example text/002.png
Vol. 1/Chapter 2 example text/001.png
Vol. 1/Chapter 2 example text/002.png
Vol. 1/Vol. 1 - Special 1/001.png
Vol. 1/Vol. 1 - Special 1/002.png
Vol. 1/Vol. 1 Extra/001.png
Vol. 1/Vol. 1 Extra/002.png

How do I achieve the same result with Python that I achieve with Bash and sort -n?

Upvotes: 1

Views: 78

Answers (2)

Spencer Post

Reputation: 16

You might also consider a key function that strips all non-alphanumeric characters before comparing:

import re
images = sorted(images, key=lambda x: re.sub(r'[^A-Za-z\d]+', '', x))
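To see what this key does on the data from the question, here is a minimal sketch. The helper name alnum_key is hypothetical, and the list is the sample from images.txt with trailing newlines omitted.

```python
import re

# Sample lines from the question's images.txt (trailing newlines omitted).
images = [
    "Vol. 1/Vol. 1 - Special 1/002.png",
    "Vol. 1/Chapter 2 example text/002.png",
    "Vol. 1/Vol. 1 Extra/002.png",
    "Vol. 1/Chapter 2 example text/001.png",
    "Vol. 1/Vol. 1 Extra/001.png",
    "Vol. 1/Chapter 1 example text/002.png",
    "Vol. 1/Vol. 1 - Special 1/001.png",
    "Vol. 1/Chapter 1 example text/001.png",
]


def alnum_key(path):
    """Hypothetical helper: compare on ASCII letters and digits only."""
    # Spaces, dots, slashes and hyphens are all dropped from the key.
    return re.sub(r"[^A-Za-z\d]+", "", path)


result = sorted(images, key=alnum_key)

for image in result:
    print(image)
```

For this sample the "Vol. 1 Extra" lines come out before the "Vol. 1 - Special 1" lines, as in the sort -n output. Keep in mind that distinct paths can collapse to the same key (for example "a-b" and "ab"); sorted is stable, so such ties keep their input order.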

Upvotes: 0

jpp

Reputation: 164733

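While I'm no expert, it seems sort orders '-' differently from Python. A likely cause: none of your lines start with a number, so sort -n treats them all as equal numerically and falls back to comparing whole lines with the current locale's collation rules, whereas Python's sorted compares strings by code point. A quick fix for this particular issue is to replace ' - ' with ' ' when sorting (L is your list of lines):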

L = sorted(L, key=lambda x: x.replace(' - ', ' '))
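If the difference really does come from locale collation, a more general sketch is to sort with locale.strxfrm from the standard library. This assumes the Python process runs under the same locale settings as the shell; the helper name locale_sorted is hypothetical, and the resulting order depends on the platform's locale data, so it may not match sort on every system.

```python
import locale


def locale_sorted(lines):
    """Hypothetical helper: sort strings by the process locale's collation."""
    try:
        # Adopt the locale named by the environment (LANG, LC_ALL, ...).
        locale.setlocale(locale.LC_COLLATE, "")
    except locale.Error:
        # Assumption: if that locale is unavailable, keep the current one.
        pass
    # strxfrm turns each string into a key that compares per the locale.
    return sorted(lines, key=locale.strxfrm)


L = [
    "Vol. 1/Vol. 1 - Special 1/001.png",
    "Vol. 1/Vol. 1 Extra/001.png",
    "Vol. 1/Chapter 1 example text/001.png",
]

for line in locale_sorted(L):
    print(line)
```

Under the default "C" locale this gives the same order as plain sorted, so it only changes anything when a locale such as en_US.UTF-8 is active.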

Upvotes: 2
