gwydion93

Reputation: 1923

Finding if a list item is in a filename

Suppose I have a filename like D_Passaic_F01_NBR_E0003.tif that's in a folder I am iterating through using Python, and I want to get all of the files between E0001 and E0010. I might make a list like: select_libr = ['E0001', 'E0002', 'E0003', 'E0004', 'E0005', 'E0006', 'E0007', 'E0008', 'E0009', 'E0010']. Using this list, how can I check the filename while iterating through the directory to pull out just those key files?

import os

for filename in os.listdir(directory):
    # endswith accepts a tuple, so both extensions are checked in one call
    if filename.startswith("D_") and filename.endswith((".tif", ".tiff")):
        print(os.path.join(directory, filename))

What I want to do is something like: ...and (item in select_libr in filename), but I am not sure of the correct syntax here. Any suggestions?

Upvotes: 0

Views: 47

Answers (1)

pho

Reputation: 25500

You can use a regular expression to extract the number from Exxx and then do what you want with it. For example,

E(\d+)\.tiff?$ will match E, then one or more digits, then .tif, followed by an optional f, at the end of the string. More importantly, it captures the digits as a group, which lets us pull just the digits out of the match object.

import os
import re

for filename in os.listdir(directory):
    research = re.search(r"E(\d+)\.tiff?$", filename)
    if research: # If there was a match
        fnum = research.group(1) # This is the string "0003", for example
        # Then do whatever you want with it
        if 1 <= int(fnum) <= 10: # E0001 through E0010
            print(filename)
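
To see what the capture group returns, here is a small sketch that applies the same pattern to the sample filename from the question without touching the filesystem. The helper name tile_number and the constant TILE_RE are made up for illustration and are not part of the original answer.

```python
import re

# Same pattern as above: "E", captured digits, then ".tif" or ".tiff" at the end.
TILE_RE = re.compile(r"E(\d+)\.tiff?$")

def tile_number(filename):
    """Return the integer after 'E', or None if the name does not match."""
    match = TILE_RE.search(filename)
    return int(match.group(1)) if match else None

print(tile_number("D_Passaic_F01_NBR_E0003.tif"))   # 3
print(tile_number("D_Passaic_F01_NBR_E0010.tiff"))  # 10
print(tile_number("notes.txt"))                     # None
```

Converting to int drops the leading zeros, so the range check works on plain numbers rather than on strings.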

If you want to allow arbitrary values, I highly recommend using a set instead of a list to store those values, because a membership test on a set takes constant time on average, while a list has to be scanned item by item.

select_libr = {'E0001', 'E0002', 'E0003', 'E0004', 'E0005', 'E0006', 'E0007', 'E0008', 'E0009', 'E0010'}

And change the regex so that the E is also captured: (E\d+)\.tiff?$

for filename in os.listdir(directory):
    research = re.search(r"(E\d+)\.tiff?$", filename)
    if research: # If there was a match
        fnum = research.group(1) # This is the string "E0003", for example
        # Then do whatever you want with it
        if fnum in select_libr:
            print(filename)
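
If the wanted labels are a contiguous run, the set does not have to be typed by hand. A minimal sketch, assuming the labels are always "E" followed by four zero-padded digits as in the question; the helper is_selected is a hypothetical name used only here.

```python
import re

# Build {'E0001', ..., 'E0010'} instead of typing each label by hand.
select_libr = {f"E{n:04d}" for n in range(1, 11)}

def is_selected(filename, wanted=select_libr):
    """True if the E-label just before .tif/.tiff is in the wanted set."""
    match = re.search(r"(E\d+)\.tiff?$", filename)
    return bool(match) and match.group(1) in wanted

print(is_selected("D_Passaic_F01_NBR_E0003.tif"))  # True
print(is_selected("D_Passaic_F01_NBR_E0011.tif"))  # False
```

Generating the labels also avoids accidental duplicates or gaps in a hand-written list.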

To ensure your filename starts with D_, you can prepend ^D_.*? to either regex. This matches D_ at the start of the string, followed by as few characters as needed (a lazy match) before the rest of the pattern. Everything else can remain the same.
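
Putting the pieces together, the prefix check, the extension check and the label lookup can all live in one pattern. This is only a sketch: select_files and PATTERN are hypothetical names, and the function takes a plain list of names so it can be tried without a real directory.

```python
import re

# "D_" at the start, lazily skip ahead, capture the E-label, then .tif/.tiff at the end.
PATTERN = re.compile(r"^D_.*?(E\d+)\.tiff?$")

def select_files(filenames, wanted):
    """Return the names that start with D_ and whose E-label is in wanted."""
    selected = []
    for name in filenames:
        match = PATTERN.search(name)
        if match and match.group(1) in wanted:
            selected.append(name)
    return selected

names = [
    "D_Passaic_F01_NBR_E0003.tif",
    "X_Passaic_F01_NBR_E0003.tif",   # wrong prefix
    "D_Passaic_F01_NBR_E0042.tiff",  # label not wanted
]
print(select_files(names, {"E0003"}))  # ['D_Passaic_F01_NBR_E0003.tif']
```

In the loop from the question, os.listdir(directory) would be passed in place of names.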

Upvotes: 1
