Reputation: 1923
Suppose I have a filename like D_Passaic_F01_NBR_E0003.tif that's in a folder I am iterating through using Python. Suppose I want to get all of the files between E0001 and E0010. I might make a list like: select_libr = ['E0001', 'E0002', 'E0003', 'E0004', 'E0005', 'E0006', 'E0007', 'E0008', 'E0009', 'E0010']. Using this list, how can I check the filename while iterating through the directory to pull out just those key files?
for filename in os.listdir(directory):
    if (filename.startswith("D_")) and (filename.endswith(".tif") or filename.endswith(".tiff")):
        print(os.path.join(directory, filename))
    else:
        continue
What I want to do is something like: ...and (item in select_libr in filename), but I am not sure of the correct syntax here. Any suggestions?
Upvotes: 0
Views: 47
Reputation: 25500
You can use a regular expression to extract the number from Exxx and then do what you want with it. For example, E(\d+)\.tiff?$ will match E, then one or more digits, then .tif, followed by an optional f at the end of the string. More importantly, it captures the digits as a group, which lets us pull just the digits out of the match object.
import os
import re

for filename in os.listdir(directory):
    research = re.search(r"E(\d+)\.tiff?", filename)
    if research:  # If there was a match
        fnum = research.group(1)  # This is the string "0003", for example
        # Then do whatever you want with it
        if 1 <= int(fnum) <= 10:  # E0001 through E0010, inclusive
            print(filename)
If you want to allow arbitrary values, I highly recommend using a set instead of a list to store those values, because a membership check on a set takes constant time on average, whereas a list has to be scanned element by element.
select_libr = {'E0001', 'E0002', 'E0003', 'E0004', 'E0005', 'E0006', 'E0007', 'E0008', 'E0009', 'E0010'}
And change the regex so that the E is also captured: (E\d+)\.tiff?
for filename in os.listdir(directory):
    research = re.search(r"(E\d+)\.tiff?", filename)
    if research:  # If there was a match
        fnum = research.group(1)  # This is the string "E0003", for example
        # Then do whatever you want with it
        if fnum in select_libr:
            print(filename)
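If the keys form a contiguous range, the set does not have to be typed out by hand. Here is a minimal sketch that builds it; make_keys and its parameters are hypothetical names chosen for illustration, and it assumes the keys are always a prefix followed by a zero-padded, four-digit number, as in the example filename.

```python
def make_keys(first, last, prefix="E", width=4):
    """Return the set of zero-padded keys from first to last, inclusive.

    make_keys is a hypothetical helper, not part of any library.
    """
    return {f"{prefix}{n:0{width}d}" for n in range(first, last + 1)}


# Same contents as the hand-written set: 'E0001' through 'E0010'
select_libr = make_keys(1, 10)
```

Generating the keys this way also avoids accidentally repeating or skipping an entry in a long literal.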
To ensure your filename starts with D_, you can prepend ^D_.*? to the other regexes. This looks for D_ at the start of the string, followed by any characters, as few as are needed to reach the rest of the pattern. Everything else can remain the same.
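Putting the pieces together, the prefix check, the captured key, and the set lookup might be combined as in the sketch below. The names PATTERN and select_files are made up for this example, and the trailing $ is an added assumption so that names such as something.tif.bak are rejected. It works on a plain list of names so it can be tried without touching the filesystem.

```python
import re

# ^D_ anchors the prefix, .*? lazily skips the middle of the name,
# (E\d+) captures the key, and \.tiff?$ requires the name to end in .tif or .tiff.
PATTERN = re.compile(r"^D_.*?(E\d+)\.tiff?$")


def select_files(filenames, keys):
    """Return the names that match PATTERN and whose captured key is in keys."""
    selected = []
    for name in filenames:
        match = PATTERN.search(name)
        if match and match.group(1) in keys:
            selected.append(name)
    return selected


# Sample names invented for illustration
names = [
    "D_Passaic_F01_NBR_E0003.tif",   # kept
    "D_Passaic_F01_NBR_E0011.tif",   # key not in the set
    "X_Passaic_F01_NBR_E0003.tif",   # wrong prefix
    "D_Passaic_F01_NBR_E0010.tiff",  # kept
    "D_Passaic_F01_NBR_E0003.txt",   # wrong extension
]
print(select_files(names, {"E0003", "E0010"}))
```

With a real folder, the call would be select_files(os.listdir(directory), select_libr), after importing os.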
Upvotes: 1