pufAmuf

Reputation: 7805

How to loop through files of certain extensions?

I'm trying to loop through a folder and all subfolders to find all files of certain file types - for example, only .mp4, .avi, .wmv.

Here is what I have now; it loops through all file types:

import os
rootdir = 'input'

for subdir, dirs, files in os.walk(rootdir):
    for file in files:
        print(os.path.join(subdir, file))

Upvotes: 28

Views: 30204

Answers (5)

vaibhav singh

Reputation: 183

This one-line solution may also be useful for listing all .py files in the current directory (it does not descend into subfolders):

import os

for file in filter(lambda x: x.endswith('.py'), os.listdir('./')):
    print(file)

Upvotes: 2

Mauricio De Diana

Reputation: 139

Since Python 3.4 you can use pathlib:

from pathlib import Path
from itertools import chain

rootdir = 'input'
p = Path(rootdir)
for file in chain(p.glob('**/*.mp4'), p.glob('**/*.avi'), p.glob('**/*.wmv')):
    print(file)
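
Calling glob once per extension walks the directory tree once per pattern. As a rough sketch of a single-pass alternative, you could walk the tree once with rglob and compare each file's suffix. The helper name find_by_suffix is hypothetical, and lowercasing the suffix (so that .MP4 also matches) is an assumption about what you want:

```python
from pathlib import Path


def find_by_suffix(rootdir, suffixes):
    """Yield files under rootdir whose extension is in suffixes (case-insensitive).

    Hypothetical helper for illustration; not part of pathlib.
    """
    wanted = {s.lower() for s in suffixes}
    for path in Path(rootdir).rglob('*'):
        # Path.suffix is the last extension including the dot, e.g. '.mp4'
        if path.is_file() and path.suffix.lower() in wanted:
            yield path


if __name__ == '__main__':
    for file in find_by_suffix('input', ('.mp4', '.avi', '.wmv')):
        print(file)
```

Using a set for the wanted suffixes keeps the membership check cheap, and adding another file type only means adding one more string.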

Upvotes: 1

Sam Redway

Reputation: 8127

I actually did something similar to this a couple of days ago and here is how I did it:

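import os

EXTENSIONS = ('.cpp', '.hpp')
top = '.'  # directory to start from (assumed here; adjust as needed)

for root, dirs, files in os.walk(top):
    for file in files:
        if file.endswith(EXTENSIONS):
            # file ends with one of the extensions, so do your thing
            print(os.path.join(root, file))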

Hope this is what you are after. You can see the whole script here on my github.

Upvotes: 4

Padraic Cunningham

Reputation: 180401

For multiple extensions, the simplest approach is to use str.endswith, passing a tuple of suffixes to check:

for subdir, dirs, files in os.walk(rootdir):
    for file in files:
        if file.endswith((".avi", ".mp4", ".wmv")):
            print(os.path.join(subdir, file))
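
Note that str.endswith is case-sensitive, so a file named CLIP.MP4 would be skipped. If that matters for your files, one possible tweak is to lowercase the name before testing it. The names VIDEO_EXTENSIONS and is_video below are made up for this sketch:

```python
# Illustrative tuple; every entry includes the leading dot
VIDEO_EXTENSIONS = ('.avi', '.mp4', '.wmv')


def is_video(filename, extensions=VIDEO_EXTENSIONS):
    """Return True if filename ends with one of the extensions, ignoring case."""
    # str.endswith accepts a tuple and returns True if any suffix matches
    return filename.lower().endswith(extensions)
```

Keeping the leading dot in each suffix also avoids matching names that merely end in the letters, such as a file called "mywmv".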

You could also use iglob as shown below and chain the results, or use re.search, but endswith is probably the best approach.

import os
from itertools import chain
from glob import iglob

for subdir, dirs, files in os.walk(rootdir):
    # iglob yields paths that already include subdir
    for file in chain.from_iterable(iglob(os.path.join(subdir, p)) for p in ("*.avi", "*.mp4", "*.wmv")):
        print(file)

Since Python 3.5, glob supports recursive searches with the ** syntax when recursive=True is passed:

import os
from itertools import chain
from glob import iglob

for file in chain.from_iterable(iglob(os.path.join(rootdir, p), recursive=True)
                                for p in ("**/*.avi", "**/*.mp4", "**/*.wmv")):
    print(file)

Upvotes: 29

Ozgur Vatansever

Reputation: 52143

You can use os.path.splitext which takes a path and splits the file extension from the end of it:

import os
rootdir = 'input'
extensions = ('.mp4', '.avi', '.wmv')

for subdir, dirs, files in os.walk(rootdir):
    for file in files:
        ext = os.path.splitext(file)[-1].lower()
        if ext in extensions:
            print (os.path.join(subdir, file))
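
To make the splitext behaviour a bit more concrete, here is a small sketch of what it returns for a few made-up file names. The helper extension_of is hypothetical and just wraps the same call used above:

```python
import os


def extension_of(filename):
    """Return the lowercased extension of filename, including the leading dot."""
    # splitext returns a (root, ext) pair; only the last extension is split off
    return os.path.splitext(filename)[-1].lower()


print(os.path.splitext('movie.MP4'))       # ('movie', '.MP4')
print(os.path.splitext('archive.tar.gz'))  # ('archive.tar', '.gz')
print(os.path.splitext('README'))          # ('README', '')
print(extension_of('movie.MP4'))           # .mp4
```

Because only the last extension is returned, a name like archive.tar.gz is matched by '.gz', not '.tar.gz', and files without an extension give an empty string that never matches the tuple.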

Upvotes: 22
