Reputation: 2990
What would be the most conservative way to check if a file-name is valid in Python on all platforms (including mobile platforms like Android, iOS)?
Ex.
this_is_valid_name.jpg -> Valid
**adad.jpg -> Invalid
a/ad -> Invalid
Upvotes: 5
Views: 16884
Reputation: 4090
I wrote a function myself. I used @Voo's answer as a starting point and added checks based on this answer.
import re

def is_valid_folder_name(name: str):
    """Raise ValueError if name is not a valid Windows (NTFS) folder name."""
    # Regular expression matching forbidden characters, including the
    # control characters 0x00-0x1F (written in hex: '\31' would be read
    # as octal and stop at 0x19)
    ILLEGAL_NTFS_CHARS = r'[<>:/\\|?*"]|[\x00-\x1f]'
    # List of forbidden (reserved) names
    FORBIDDEN_NAMES = ['CON', 'PRN', 'AUX', 'NUL',
                       'COM1', 'COM2', 'COM3', 'COM4', 'COM5',
                       'COM6', 'COM7', 'COM8', 'COM9',
                       'LPT1', 'LPT2', 'LPT3', 'LPT4', 'LPT5',
                       'LPT6', 'LPT7', 'LPT8', 'LPT9']
    # Check for forbidden characters
    match = re.search(ILLEGAL_NTFS_CHARS, name)
    if match:
        raise ValueError(
            f"Invalid character '{match[0]}' for filename {name}")
    # Check for forbidden names (case-insensitive)
    if name.upper() in FORBIDDEN_NAMES:
        raise ValueError(f"{name} is a reserved folder name in Windows")
    # Check for empty name (disallowed in Windows)
    if name.strip() == "":
        raise ValueError("Empty file name not allowed in Windows")
    # Check for names starting or ending with dot or space
    match = re.match(r'^[. ]|.*[. ]$', name)
    if match:
        raise ValueError(
            f"Invalid start or end character ('{match[0]}')"
            f" in folder name {name}"
        )
In your example:
$ is_valid_folder_name('this_is_valid_name.jpg')
$ is_valid_folder_name('**adad.jpg')
---------------------------------------------------------------------------
ValueError in is_valid_folder_name(name)
13 match = re.search(ILLEGAL_NTFS_CHARS, name)
14 if match:
---> 15 raise ValueError(
16 f"Invalid character '{match[0]}' for filename {name}")
17 # Check for forbidden names
ValueError: Invalid character '*' for filename **adad.jpg
$ is_valid_folder_name('a/ad')
---------------------------------------------------------------------------
ValueError in is_valid_folder_name(name)
13 match = re.search(ILLEGAL_NTFS_CHARS, name)
14 if match:
---> 15 raise ValueError(
16 f"Invalid character '{match[0]}' for filename {name}")
17 # Check for forbidden names
ValueError: Invalid character '/' for filename a/ad
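The function above signals problems by raising, so it returns None for a valid name. If a plain Valid/Invalid answer is wanted, as in the question, the same checks can be sketched as a boolean function. This is only a sketch: the name filename_is_valid is made up for illustration, and the extra rule that a reserved name followed by an extension (such as NUL.txt) is also rejected is an addition to the checks above, based on the Windows naming guidelines.

```python
import re

# Characters rejected by Windows/NTFS, including control characters 0x00-0x1F
_ILLEGAL_CHARS = re.compile(r'[<>:"/\\|?*\x00-\x1f]')

# Reserved device names on Windows (compared case-insensitively)
_RESERVED_NAMES = {
    "CON", "PRN", "AUX", "NUL",
    *(f"COM{i}" for i in range(1, 10)),
    *(f"LPT{i}" for i in range(1, 10)),
}


def filename_is_valid(name: str) -> bool:
    """Return True if name passes the conservative Windows-style checks."""
    # Empty or whitespace-only names are rejected
    if not name.strip():
        return False
    # Any forbidden character makes the name invalid
    if _ILLEGAL_CHARS.search(name):
        return False
    # Reserved names are rejected with or without an extension,
    # so both "NUL" and "nul.txt" fail here
    if name.split(".")[0].upper() in _RESERVED_NAMES:
        return False
    # Names must not start or end with a dot or a space
    if name[0] in ". " or name[-1] in ". ":
        return False
    return True


for candidate in ("this_is_valid_name.jpg", "**adad.jpg", "a/ad"):
    print(candidate, "->", "Valid" if filename_is_valid(candidate) else "Invalid")
```

Like the original, this rejects names with a leading dot, which is stricter than most systems require.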
If anyone finds something I missed, feel free to add it or comment!
Upvotes: 1
Reputation: 39
A related topic is "Filename Pattern Matching".
These are the methods and functions available to you:
endswith() and startswith() string methods
fnmatch.fnmatch()
glob.glob()
pathlib.Path.glob()
import os

# Get .txt files
for f_name in os.listdir('some_directory'):
    if f_name.endswith('.txt'):
        print(f_name)
Simple Filename Pattern Matching Using fnmatch()
import os
import fnmatch

for file_name in os.listdir('some_directory/'):
    if fnmatch.fnmatch(file_name, '*.txt'):
        print(file_name)
More Advanced Pattern Matching
import os
import fnmatch

for filename in os.listdir('.'):
    if fnmatch.fnmatch(filename, 'data_*_backup.txt'):
        print(filename)
Filename Pattern Matching Using glob
import glob
glob.glob('*.py')
Or, to match .txt names that contain a digit:
import glob

for name in glob.glob('*[0-9]*.txt'):
    print(name)
Or, to search subdirectories recursively:
import glob

for file in glob.iglob('**/*.py', recursive=True):
    print(file)
Or, using pathlib:
from pathlib import Path

p = Path('.')
for name in p.glob('*.p*'):
    print(name)
Upvotes: -2
Reputation: 1931
The strictest way to check whether a name would be a valid filename on your target OSes is to check it against a list of properly tested filenames.
valid = myfilename in ['this_is_valid_name.jpg']
Expanding on that, you could define a set of characters that you know are allowed in filenames on every platform:
valid = set(valid_char_sequence).issuperset(myfilename)
But this is not going to be enough, as some OSes have reserved filenames.
You need to either exclude reserved names or create an expression (regexp) matching the OS allowed filename domain, and test your filename against that, for each target platform.
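As a rough sketch of the whitelist-plus-reserved-names idea: the character set below is the POSIX portable filename character set (ASCII letters, digits, '.', '_' and '-'), and the reserved list is the Windows device names. Both choices, and the function name is_portable_filename, are assumptions made for illustration; your target platforms may need a different set.

```python
import string

# Assumption: only the POSIX portable filename characters are allowed
PORTABLE_CHARS = frozenset(string.ascii_letters + string.digits + "._-")

# Assumption: Windows device names are the only reserved names to exclude
RESERVED_NAMES = frozenset(
    ["CON", "PRN", "AUX", "NUL"]
    + [f"COM{i}" for i in range(1, 10)]
    + [f"LPT{i}" for i in range(1, 10)]
)


def is_portable_filename(name: str) -> bool:
    """Return True if name uses only whitelisted characters and is not reserved."""
    # Empty names and the special directory entries are never valid filenames
    if name in ("", ".", ".."):
        return False
    # Every character of the name must be in the whitelist
    if not PORTABLE_CHARS.issuperset(name):
        return False
    # Compare the part before the first dot, so "aux.txt" is caught as well
    if name.split(".", 1)[0].upper() in RESERVED_NAMES:
        return False
    return True


print(is_portable_filename("this_is_valid_name.jpg"))
print(is_portable_filename("**adad.jpg"))
print(is_portable_filename("a/ad"))
```

A whitelist errs on the side of rejecting names (anything with spaces or non-ASCII letters fails here), which fits the "most conservative" requirement but may be too strict for user-facing names.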
AFAIK, Python does not offer such helpers, because it's Easier to Ask Forgiveness than Permission (EAFP). There are many possible combinations of OSes and filesystems, so it is easier to react appropriately when the OS raises an exception than to check for a safe filename domain for all of them.
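A minimal sketch of the ask-forgiveness approach, assuming it is acceptable to create a throwaway file in a temporary directory: try to create the file and treat any error from the OS as "invalid". The helper name can_create_file is hypothetical. Note that this only tells you about the OS and filesystem the code is running on, not about every platform.

```python
import os
import tempfile


def can_create_file(name: str) -> bool:
    """Return True if the current OS accepts name as a file name."""
    # Only bare names are tested; "a/ad" would otherwise be read as a path
    if not name or name in (".", "..") or os.path.basename(name) != name:
        return False
    # A fresh temporary directory guarantees the name does not exist yet
    with tempfile.TemporaryDirectory() as tmp_dir:
        try:
            # "x" mode creates the file and fails if it already exists
            with open(os.path.join(tmp_dir, name), "x"):
                pass
        except (OSError, ValueError):
            # OSError: the OS rejected the name;
            # ValueError: e.g. an embedded null byte
            return False
    return True


print(can_create_file("this_is_valid_name.jpg"))
print(can_create_file("a/ad"))
```

In real code you would usually skip the probe entirely and wrap the actual file creation in the same try/except, handling the failure where it happens.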
Upvotes: 4