Baktaawar

Reputation: 7490

Glob fetching pdf files with lower or upper .pdf extension

I have a piece of code that traverses a directory tree using os.walk and, in each relevant directory, gets the list of all PDF files there.

To get the list of PDF files in a particular directory, I use glob like this:

file_list = glob.glob(os.path.join(root,invoice_dir_name, "*.pdf"))

It fetches all files in a directory which end with .pdf.

But I just found a corner case: if the directory has PDF files whose names end in .PDF, glob returns an empty list, because the pattern only matches the lowercase .pdf extension.

How can I add a regular expression to the glob call so that it fetches either .pdf or .PDF? I tried

file_list = glob.glob(os.path.join(root,invoice_dir_name, "*.(pdf|PDF)"))

but obviously it doesn't work.

My code uses glob and os.walk, and switching to anything else would mean redoing the code, so I was wondering whether a solution can be found with glob. Thanks.

Upvotes: 0

Views: 1418

Answers (1)

Abhilash

Reputation: 2256

How about searching for .pdf and .PDF separately and collecting the results into a single list? This way, the matches found for both patterns are combined and returned.

import glob
import os

def get_files(root, dir_name, pattern):
    patterns = [os.path.join(root, dir_name, pattern.upper()), os.path.join(root, dir_name, pattern.lower())]
    return [filename for p in patterns for filename in glob.glob(p)]

If not a new function, then simply replace:

file_list = glob.glob(os.path.join(root,invoice_dir_name, "*.pdf"))

with:

pattern = "*.pdf"
p_upper = os.path.join(root, invoice_dir_name, pattern.upper())  # matches *.PDF
p_lower = os.path.join(root, invoice_dir_name, pattern.lower())  # matches *.pdf
file_list = [fname for p in (p_upper, p_lower) for fname in glob.glob(p)]

Output:

[
  '/Users/username/docs/37-sbc-sleep-apnea-2018.PDF',
  '/Users/username/docs/notice.pdf',
  '/Users/username/docs/Health2020.pdf',
  '/Users/username/docs/West.pdf',
  '/Users/username/docs/hello-Health-net-excel-file-2020.pdf', 
  '/Users/username/docs/2018-arbitration-form-english.pdf'
]
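
If you would rather keep a single glob call, note that glob patterns are not regular expressions, so "(pdf|PDF)" is not understood, but they do support character classes such as [pP]. Below is a minimal sketch of that idea. The helper names case_insensitive_pattern and get_files_any_case are made up for illustration, and the helper assumes the pattern you pass in does not already contain bracket expressions.

```python
import glob
import os


def case_insensitive_pattern(pattern):
    """Expand each letter into a two-character class, e.g. '*.pdf' -> '*.[pP][dD][fF]'."""
    return "".join(
        f"[{ch.lower()}{ch.upper()}]" if ch.isalpha() else ch
        for ch in pattern
    )


def get_files_any_case(root, dir_name, pattern="*.pdf"):
    # Hypothetical helper: only the file-name pattern is expanded,
    # the directory parts of the path are passed through unchanged.
    full_pattern = os.path.join(root, dir_name, case_insensitive_pattern(pattern))
    return sorted(glob.glob(full_pattern))
```

Unlike the two-pattern version, this also picks up mixed-case extensions such as .Pdf. A single pattern should also avoid listing a file twice on platforms where glob matching already ignores case, such as Windows.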

Upvotes: 1
