Reputation: 319
I'm trying to get all files with Excel extensions, so I thought the pattern below would select any file that has xls in its name and pick up xls, xlsx, xlsm, etc.
path is a variable holding the folder I'm extracting these files from, and all_files stores the results. Shouldn't the trailing /* match any file that has .xls in it? Using /*.xlsx or /*.xlsm works fine.
all_files=glob.glob(path + "/*.xls/*")
Upvotes: 2
Views: 1129
Reputation: 42017
You are trying to get all files that have .xls in their names, and you're using the glob pattern:
/*.xls/*
This matches entries inside directories whose names end in .xls (note the / after .xls), not files with an .xls extension.
You need:
glob.glob(path + "/*.xls*")
However, that is not precise, since it matches any file whose name contains the string .xls, e.g. foo.xlsbar.
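To see the over-matching without touching the filesystem, here is a minimal sketch using fnmatch, which applies the same shell-style wildcard rules as glob. The file names in the list are made up for illustration.

```python
import fnmatch

# Hypothetical file names, chosen only to illustrate the pattern.
names = ["report.xls", "report.xlsx", "macro.xlsm", "foo.xlsbar", "notes.txt"]

# "*.xls*" accepts anything after ".xls", so foo.xlsbar slips through.
matched = fnmatch.filter(names, "*.xls*")
print(matched)
```

Only notes.txt is rejected; foo.xlsbar is kept alongside the real Excel files.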
The problem is that standard shell globbing is not as flexible as regex: even with [] and ?, a single pattern cannot express an optional trailing character. You can filter the glob results with a regex check afterwards:
import glob
import re
# Keep only names that end in .xls, .xlsx or .xlsm
req = re.compile(r'\.xls[xm]?$')
all_files = [f for f in glob.iglob(path + '/*.xls*') if req.search(f)]
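As an alternative to the regex filter, a rough sketch with pathlib compares each file's suffix against an explicit set. This is a different technique from the one above, not a replacement for it. The helper name find_excel_files, the EXCEL_SUFFIXES set, and the temporary files are assumptions made for the demo. Unlike the regex, this version ignores case, so REPORT.XLSX would also match.

```python
import tempfile
from pathlib import Path

# Assumed set of extensions; extend it if other Excel formats are needed.
EXCEL_SUFFIXES = {".xls", ".xlsx", ".xlsm"}

def find_excel_files(folder):
    """Return sorted paths of regular files in folder with an Excel suffix."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in EXCEL_SUFFIXES
    )

# Demo in a throwaway directory so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["a.xls", "b.xlsx", "c.xlsm", "foo.xlsbar", "notes.txt"]:
        (Path(tmp) / name).touch()
    (Path(tmp) / "archive.xls").mkdir()  # a directory, so it is skipped
    found = [p.name for p in find_excel_files(tmp)]
    print(found)
```

The is_file() check also keeps out directories that happen to be named like spreadsheets.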
Upvotes: 1
Reputation: 1522
You have an extra "/" in your expression. To add the wildcard to the end of ".xls" you need:
all_files=glob.glob(path + "/*.xls*")
Upvotes: 0