Reputation: 323
My file structure looks like this:
- Outer folder
- Inner folder 1
- Files...
- Inner folder 2
- Files...
- …
I'm trying to count the total number of files in the whole of Outer folder. os.walk doesn't return any files when I pass it the Outer folder, and as I've only got two layers I've written it manually:
import os

total = 0
# Outer_folder is the path to the top-level directory
folders = [name for name in os.listdir(Outer_folder)
           if os.path.isdir(os.path.join(Outer_folder, name))]
for folder in folders:
    contents = os.listdir(os.path.join(Outer_folder, folder))
    total += len(contents)
print(total)
Is there a better way to do this? And can I find the number of files in an arbitrarily nested set of folders? I can't see any examples of deeply nested folders on Stack Overflow.
By 'better', I mean some kind of built-in function, rather than manually writing something to iterate - e.g. an os.walk that walks the whole tree.
Upvotes: 1
Views: 2847
Reputation: 62513
Use pathlib rather than os, because it treats paths as objects with methods, not strings to be sliced. To count only files, use [x.parent for x in f if x.is_file()].
from pathlib import Path
import numpy as np
p = Path.cwd() # if you're running in the current dir
# p = Path('path to dir')  # otherwise, specify a path
# creates a generator of all the files and directories under p, at any depth
f = p.rglob('*')
# optionally, use list(...) to unpack the generator
# f = list(p.rglob('*'))
# counts the entries in each parent directory
paths, counts = np.unique([x.parent for x in f], return_counts=True)
path_counts = list(zip(paths, counts))
[(WindowsPath('E:/PythonProjects/stack_overflow'), 8),
(WindowsPath('E:/PythonProjects/stack_overflow/.ipynb_checkpoints'), 7),
(WindowsPath('E:/PythonProjects/stack_overflow/complete_solutions/data'), 6),
(WindowsPath('E:/PythonProjects/stack_overflow/csv_files'), 3),
(WindowsPath('E:/PythonProjects/stack_overflow/csv_files/.ipynb_checkpoints'), 1),
(WindowsPath('E:/PythonProjects/stack_overflow/data'), 5)]
f = list(p.rglob('*')) unpacks the generator and produces a list of all the files.
As a one-liner, use Path.cwd().rglob('*') or Path('some path').rglob('*'):
path_counts = list(zip(*np.unique([x.parent for x in Path.cwd().rglob('*')], return_counts=True)))
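If NumPy is not otherwise needed, the same per-directory tally can be sketched with the standard library alone. The helper name count_files_per_dir is hypothetical, and this version assumes that only regular files should be counted, so directories are filtered out with is_file():

```python
from collections import Counter
from pathlib import Path


def count_files_per_dir(root):
    """Map each directory under root to the number of files directly inside it."""
    # rglob('*') yields files and directories at every depth; keep files only
    return Counter(p.parent for p in Path(root).rglob('*') if p.is_file())
```

Summing the values, e.g. sum(count_files_per_dir(Path.cwd()).values()), should then give the total number of files in the whole tree.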
Upvotes: 7
Reputation: 23
I suggest using recursion, as in the function below, which counts the folders nested under path:
import os

def get_folder_count(path):
    # keep only the entries of path that are directories
    folders = [name for name in os.listdir(path)
               if os.path.isdir(os.path.join(path, name))]
    count = len(folders)
    # add the folders nested inside each subdirectory
    for folder in folders:
        count += get_folder_count(os.path.join(path, folder))
    return count
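The function above counts folders. For counting files, os.walk already performs the recursion, so a shorter variant is possible. This is a minimal sketch, assuming every non-directory entry should count as a file; the name count_files is hypothetical:

```python
import os


def count_files(root):
    """Return the number of files in root and all of its subdirectories."""
    # os.walk yields one (dirpath, dirnames, filenames) tuple per directory in the tree
    return sum(len(filenames) for _, _, filenames in os.walk(root))
```

Note that os.walk returns a generator that has to be iterated. The first tuple it yields describes only the top-level folder, which in the layout from the question contains no files directly.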
Upvotes: 1