Reputation: 5282
I want to know whether it is possible to use glob.glob("**/*.jpg")
to get all images in several folders, but as an iterator, so that the full list of paths is never held in memory.
Currently, I am using the following code with glob:
for file in glob.glob("**/*.jpg")[:1]:
    print(file)
and the following with os.scandir:
for model_folder in os.scandir(folder):
    for model_folder_content in os.scandir(model_folder):
        print(model_folder_content)
The problem with the first approach is that glob.glob() builds the complete list of matches before the loop starts, so with a very large number of files it can exhaust memory and fail. The idea is to use scandir because it returns an iterator, but I also want to be able to filter by a pattern.
Is this possible?
Thanks
Upvotes: 2
Views: 461
Reputation: 5372
You can use pathlib.Path.rglob(), which returns a generator:
>>> from pathlib import Path
>>> folder = Path('/home/accdias')
>>> jpgs = folder.rglob('*.jpg')
>>> type(jpgs)
<class 'generator'>
>>>
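Because rglob() is lazy, taking only the first few matches does not require walking the whole tree into a list first. The following is a minimal sketch of that idea; the helper name first_matches, the temporary directory layout, and the file names are made up for illustration and are not part of the original answer.

```python
import tempfile
from itertools import islice
from pathlib import Path


def first_matches(root, pattern="*.jpg", limit=1):
    """Return at most `limit` paths under `root` that match `pattern`.

    islice() stops pulling from the rglob() generator once `limit`
    items have been produced, so the remaining matches are never collected.
    """
    return list(islice(Path(root).rglob(pattern), limit))


# Build a small throwaway tree (hypothetical names) so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    for rel in ("a/one.jpg", "a/b/two.jpg", "c/notes.txt"):
        target = Path(tmp, rel)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()

    found = first_matches(tmp, limit=1)
    all_names = sorted(p.name for p in Path(tmp).rglob("*.jpg"))
```

This mirrors the [:1] slice from the question, except that the slice is applied to the generator instead of to a fully built list.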
Upvotes: 2
Reputation: 1250
The glob module has a dedicated function for this particular problem called iglob(), which takes the same parameters as glob() and returns an iterator instead of a list.
The docs for iglob state the following:
Return an iterator which yields the same values as glob() without actually storing them all simultaneously.
In your case, the code snippet could look something like:
# recursive=True is needed for "**" to descend into nested folders
for file in glob.iglob("**/*.jpg", recursive=True):
    # do something with the file
    print(file)
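One detail worth checking when using "**" with iglob() is the recursive flag: without recursive=True, "**" behaves like a plain "*" and matches only one directory level. The sketch below compares the two on a throwaway tree; the helper name iter_jpgs and the file names are assumptions made for the example.

```python
import glob
import os
import tempfile


def iter_jpgs(root):
    """Lazily yield every .jpg path under `root`, at any depth."""
    pattern = os.path.join(root, "**", "*.jpg")
    return glob.iglob(pattern, recursive=True)


# Build a small throwaway tree (hypothetical names) so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    for rel in ("top.jpg", "a/one.jpg", "a/b/two.jpg", "a/readme.txt"):
        path = os.path.join(tmp, *rel.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()

    # With recursive=True, "**" matches zero or more directory levels.
    recursive_names = sorted(os.path.basename(p) for p in iter_jpgs(tmp))

    # Without it, "**" acts like "*", so only files exactly one level down match.
    shallow_pattern = os.path.join(tmp, "**", "*.jpg")
    shallow_names = sorted(
        os.path.basename(p) for p in glob.iglob(shallow_pattern)
    )
```

In both cases the result of iglob() is consumed one path at a time; only the sorted() calls here, used for comparison, build a list.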
Upvotes: 1
Reputation: 14253
You can use glob.iglob():
glob.iglob(pathname, *, recursive=False)
Return an iterator which yields the same values as glob() without actually storing them all simultaneously.
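If you would rather stay with os.scandir, as in the question, a pattern can be applied to each entry name with fnmatch. This is only a sketch under assumptions: the generator name scan_matching and the directory layout are invented for the example, symlinked directories are deliberately not followed, and the pattern is matched against the file name only, not the full path.

```python
import fnmatch
import os
import tempfile


def scan_matching(root, pattern="*.jpg"):
    """Yield paths of files under `root` whose name matches `pattern`.

    Directories still to be visited are kept on an explicit stack, so only
    pending directory paths are held in memory, not the matching files.
    """
    stack = [root]
    while stack:
        current = stack.pop()
        with os.scandir(current) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
                elif fnmatch.fnmatch(entry.name, pattern):
                    yield entry.path


# Build a small throwaway tree (hypothetical names) so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    for rel in ("top.jpg", "a/one.jpg", "a/b/two.jpg", "a/readme.txt"):
        path = os.path.join(tmp, *rel.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()

    scanned_names = sorted(os.path.basename(p) for p in scan_matching(tmp))
```

A loop such as for path in scan_matching(folder): print(path) then processes one file at a time, in the same way as the iglob() loop.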
Upvotes: 5