Reputation: 740
I want to use a user-provided string as a filename for exporting, but have to make sure that the string is permissible on my system as a filename. From my side it would be OK to replace any forbidden character with e.g. '_'.
Here I found a list of forbidden characters for filenames.
It should be easy enough to use the str.replace() function, but I was wondering whether something already exists that does this, ideally taking into account which OS I am on.
Upvotes: 10
Views: 13527
Reputation: 859
python-slugify is a package that provides this functionality:
from slugify import slugify
invalid_filename = "inv#lid"
valid_filename = slugify(invalid_filename) # 'inv-lid'
Upvotes: 0
Reputation: 695
Depending on your use case, it might be easier to whitelist the characters that are allowed in a filename instead of attempting to construct a blacklist.
A canonical way would be to check that each character in your prospective filename is contained in the POSIX portable filename character set:
https://www.ibm.com/docs/en/zos/2.1.0?topic=locales-posix-portable-file-name-character-set
Uppercase A to Z
Lowercase a to z
Numbers 0 to 9
Period (.)
Underscore (_)
Hyphen (-)
Based on this you can then:
ok = ".-_0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
# Raises AssertionError on the first character outside the portable set.
for character in filename:
    assert character in ok
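Since the question asks for replacing forbidden characters rather than rejecting the name, the same whitelist can drive a substitution. This is only a minimal sketch: the function name `replace_forbidden` and the default `"_"` replacement are illustrative choices, not part of any library.

```python
import string

# POSIX portable filename characters, as listed above.
ALLOWED = set(string.ascii_letters + string.digits + "._-")


def replace_forbidden(name, replacement="_"):
    """Replace every character outside ALLOWED with `replacement`.

    Hypothetical helper for illustration only.
    """
    return "".join(ch if ch in ALLOWED else replacement for ch in name)


print(replace_forbidden("fi:l*e/na me.txt"))
```

Note that a whitelist alone does not catch everything: the result can still be empty, be "." or "..", or collide with a reserved name on Windows such as CON or NUL, so those cases would need separate checks.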
Upvotes: 2
Reputation: 78583
A better solution may be for you to store the files locally using generated filenames that are guaranteed to be unique and file system safe (any UUID generator would do, for example). Maintain a simple database that maps between the original filename and the UUID for later use.
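As a rough sketch of that idea, the snippet below uses the standard-library uuid module for the generated names. The dictionary `name_by_id` and the function `register_upload` are hypothetical stand-ins for the "simple database"; a real setup would persist the mapping somewhere.

```python
import uuid

# Hypothetical in-memory stand-in for the mapping database.
name_by_id = {}


def register_upload(original_name):
    """Return a generated, filesystem-safe name and remember the original."""
    safe_name = uuid.uuid4().hex  # 32 lowercase hexadecimal characters
    name_by_id[safe_name] = original_name
    return safe_name


stored = register_upload('my "report": v2?.txt')
print(stored, "->", name_by_id[stored])
```

The user-provided string never touches the file system this way, so no sanitizing is needed, at the cost of having to keep the mapping around.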
Upvotes: 0
Reputation: 6012
pathvalidate is a Python library to sanitize and validate strings such as filenames and file paths.
The library provides utilities for validating filenames:
import sys
from pathvalidate import ValidationError, validate_filename
try:
    validate_filename("fi:l*e/p\"a?t>h|.t<xt")
except ValidationError as e:
    print("{}\n".format(e), file=sys.stderr)
And utilities for sanitizing paths:
from pathvalidate import sanitize_filename
fname = "fi:l*e/p\"a?t>h|.t<xt"
print("{} -> {}".format(fname, sanitize_filename(fname)))
Upvotes: 10