Matthias Arras
Matthias Arras

Reputation: 740

Elegant way in python to make sure a string is suitable as a filename?

I want to use a user-provided string as a filename for exporting, but have to make sure that the string is permissible on my system as a filename. From my side it would be OK to replace any forbidden character with e.g. '_'.

Here I found a list of forbidden characters for filenames.

It should be easy enough to use the str.replace() function, I was just wondering if there is already something out there that does that, potentially even taking into account what OS I am on.

Upvotes: 10

Views: 13527

Answers (4)

phispi
phispi

Reputation: 859

python-slugify is a package that provides this functionality:

from slugify import slugify

invalid_filename = "inv#lid"
valid_filename = slugify(invalid_filename)  # 'inv-lid'

Upvotes: 0

Markus Hirsimäki
Markus Hirsimäki

Reputation: 695

Depending on your use case it might be easier to whitelist characters that are allowed in filename instead of attempting to construct a blacklist.

A canonical way would be to check if each character in your filename to be is contained in the list of portable posix filename characters.

https://www.ibm.com/docs/en/zos/2.1.0?topic=locales-posix-portable-file-name-character-set

Uppercase A to Z
Lowercase a to z
Numbers 0 to 9
Period (.)
Underscore (_)
Hyphen (-)

Based on this you can then:

ok = ".-_0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
for character in filename:
    assert character in ok

        

Upvotes: 2

jarmod
jarmod

Reputation: 78583

A better solution may be for you to store the files locally using generated filenames that are guaranteed to be unique and file system safe (any UUID generator would do, for example). Maintain a simple database that maps between the original filename and the UUID for later use.

Upvotes: 0

kmaork
kmaork

Reputation: 6012

pathvalidate is a Python library to sanitize/validate a string such as filenames/file-paths/etc.

This library provides both utilities for validation of paths:

import sys
from pathvalidate import ValidationError, validate_filename

try:
    validate_filename("fi:l*e/p\"a?t>h|.t<xt")
except ValidationError as e:
    print("{}\n".format(e), file=sys.stderr)

And utilities for sanitizing paths:

from pathvalidate import sanitize_filename

fname = "fi:l*e/p\"a?t>h|.t<xt"
print("{} -> {}".format(fname, sanitize_filename(fname)))

Upvotes: 10

Related Questions