Reputation: 115
I am trying to walk through the subdirectories of a parent directory looking for the .xlsx file with the newest date in the file name in each subdirectory. The naming convention for my files will be such that they will start with the date and then filename.
ex. 20180621 file name.xlsx
This way I can find the newest file from each subdirectory and run my script on them.
I have the following code, which only works if I have a .xlsx in every directory, including the parent directory. If any directory lacks a .xlsx, the code raises ValueError: max() arg is an empty sequence and then exits without continuing the search.
Parent Directory
----subdirectory1
--------subdirectory1.1
----subdirectory2
----subdirectory3
----etc.
example 1: If parent directory does not contain a .xlsx file, even though the subdirectories do, the code exits with max() empty sequence.
example 2: If there is a folder anywhere in the tree without a .xlsx file, the code exits with max() empty sequence. If subdirectory1.1 doesn't have a .xlsx file it will exit the code and not check subdirectory2 or subdirectory3.
How can I get os.walk to continue searching through all the subdirectories even after it finds one that does not contain the .xlsx file that I am looking for (including when the parent directory doesn't have a .xlsx file)?
for root, dirs, files in os.walk(path):
    list_of_files = []
    for file in files:
        if file.endswith(".xlsx"):
            list_of_files.append(file)
    largest = max(list_of_files)
    print(largest)
Upvotes: 0
Views: 371
Reputation: 1121972
os.walk() can't continue because an exception was raised. Either don't call max() with an empty list, catch the exception, or tell max() to return a default value if the list is empty.
You can simply skip looking for the largest name when there are no Excel files; the test if list_of_files: is false when the list is empty:
for root, dirs, files in os.walk(path):
    list_of_files = []
    for file in files:
        if file.endswith(".xlsx"):
            list_of_files.append(file)
    largest = None
    if list_of_files:
        largest = max(list_of_files)
    print(largest or 'No Excel files in this directory')
If you are using Python 3.4 or newer, you can also tell the max() function to return a default value if your input list is empty, by passing the default keyword argument:
for root, dirs, files in os.walk(path):
    list_of_files = []
    for file in files:
        if file.endswith(".xlsx"):
            list_of_files.append(file)
    largest = max(list_of_files, default=None)  # None is returned if the list is empty
    print(largest or 'No Excel files in this directory')
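A quick check of this behavior, using made-up file names that follow the asker's date-prefix convention. Note that the default has to be passed by keyword, as default=. Because the names start with a zero-padded YYYYMMDD date, plain string comparison should pick the newest one; that only holds as long as every file sticks to the convention.

```python
# Hypothetical file names following the "YYYYMMDD name.xlsx" convention.
names = ["20180621 report.xlsx", "20171231 report.xlsx", "20180105 report.xlsx"]

# Strings compare character by character, so the date prefix decides the result.
newest = max(names)

# With an empty input, max() returns the default instead of raising ValueError.
nothing = max([], default=None)

print(newest)
print(nothing)
```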
Last but not least, you can use try...except ValueError: to handle the exception raised:
for root, dirs, files in os.walk(path):
    list_of_files = []
    for file in files:
        if file.endswith(".xlsx"):
            list_of_files.append(file)
    try:
        largest = max(list_of_files)
        print(largest)
    except ValueError:
        print('No Excel files in this directory')
You can simplify your code by using the fnmatch.filter() function to select the matching files:
import fnmatch
import os

for root, dirs, files in os.walk(path):
    excel_files = fnmatch.filter(files, '*.xlsx')
    largest = max(excel_files, default=None)  # None if this directory has no .xlsx files
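Putting the pieces together, here is a minimal sketch of a helper that collects the newest .xlsx per directory so a script can be run on each afterwards. The function name newest_xlsx_per_dir and the choice to return a dict are assumptions made for illustration, not part of the question; the newest file is picked by name only, relying on the date prefix.

```python
import fnmatch
import os


def newest_xlsx_per_dir(path):
    """Map each directory under `path` to the full path of its newest .xlsx file.

    Directories without any .xlsx file are skipped rather than raising.
    """
    newest = {}
    for root, dirs, files in os.walk(path):
        excel_files = fnmatch.filter(files, "*.xlsx")
        largest = max(excel_files, default=None)
        if largest is None:
            continue  # nothing in this directory; os.walk keeps going
        newest[root] = os.path.join(root, largest)
    return newest
```

Calling newest_xlsx_per_dir(path) and looping over the returned values gives one file per directory that actually contains a .xlsx, regardless of whether the parent directory has one.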
Upvotes: 4
Reputation: 54213
os.walk doesn't stop on its own; max raises an error. You can handle this in a couple of ways:
...
for file in files:
    if file.endswith(".xlsx"):
        list_of_files.append(file)
if list_of_files:  # if it's not blank...
    print(max(list_of_files))
or
...
for file in files:
    if file.endswith(".xlsx"):
        list_of_files.append(file)
try:
    print(max(list_of_files))
except ValueError:  # something goes wrong with `max` (or `print` I guess)
    # what do we do? Probably...
    pass
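On the remark about print: a small sketch that keeps only the max() call inside the try block, so a ValueError raised by anything else isn't silently swallowed. The helper name newest_or_none is made up for illustration.

```python
def newest_or_none(names):
    """Return the largest name, or None if `names` is empty."""
    try:
        newest = max(names)  # the only call expected to raise ValueError here
    except ValueError:
        return None  # empty sequence: report "nothing found" to the caller
    else:
        return newest
```

Inside the os.walk loop, the result can then be tested with if newest is not None: before printing or processing it.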
Upvotes: 1