TIF

Reputation: 323

How to count total number of files in each subfolder

My file structure looks like this:

- Outer folder
  - Inner folder 1
    - Files...
  - Inner folder 2
    - Files...
  - …

I'm trying to count the total number of files in the whole of Outer folder. os.walk doesn't return any files when I pass it the Outer folder, and as I've only got two layers I've written it manually:

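import os

# Outer_folder is the path to the outer folder, defined elsewhere
total = 0
# names of the immediate subfolders of Outer_folder
folders = [name for name in os.listdir(Outer_folder)
           if os.path.isdir(os.path.join(Outer_folder, name))]
for folder in folders:
    # every entry in the subfolder is counted, including any nested folders
    contents = os.listdir(os.path.join(Outer_folder, folder))
    total += len(contents)
print(total)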

Is there a better way to do this? And can I find the number of files in an arbitrarily nested set of folders? I can't see any examples of deeply nested folders on Stack Overflow.

By 'better', I mean some kind of built-in function rather than writing the iteration manually, e.g. an os.walk that walks the whole tree.

Upvotes: 1

Views: 2847

Answers (2)

Trenton McKinney

Reputation: 62513

Use pathlib:

File and subdirectory count in each directory:

from pathlib import Path
import numpy as np

p = Path.cwd()  # if you're running in the current dir
# p = Path('path to dir')  # otherwise, specify a path

# creates a generator of every file and subdirectory below p, at any depth
f = p.rglob('*')
# optionally, use list(...) to consume the generator into a list
# f = list(p.rglob('*'))

# counts them
paths, counts = np.unique([x.parent for x in f], return_counts=True)

path_counts = list(zip(paths, counts))

Output:

  • List of tuples with path and count
[(WindowsPath('E:/PythonProjects/stack_overflow'), 8),
 (WindowsPath('E:/PythonProjects/stack_overflow/.ipynb_checkpoints'), 7),
 (WindowsPath('E:/PythonProjects/stack_overflow/complete_solutions/data'), 6),
 (WindowsPath('E:/PythonProjects/stack_overflow/csv_files'), 3),
 (WindowsPath('E:/PythonProjects/stack_overflow/csv_files/.ipynb_checkpoints'), 1),
 (WindowsPath('E:/PythonProjects/stack_overflow/data'), 5)]
  • f = list(p.rglob('*')) consumes the generator and produces a list of all the files and subdirectories.

One-liner:

  • Use Path.cwd().rglob('*') or Path('some path').rglob('*')
path_counts = list(zip(*np.unique([x.parent for x in Path.cwd().rglob('*')], return_counts=True)))
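
If NumPy is not otherwise needed, roughly the same per-directory tally can be sketched with collections.Counter from the standard library. This is a variant under stated assumptions, not the method above: it counts regular files only (filtered with is_file()), so subdirectories are not included in the counts, and directories that contain no files do not appear in the result at all. The function name count_files_per_dir is made up for illustration.

```python
from collections import Counter
from pathlib import Path


def count_files_per_dir(root):
    """Map each directory under root to the number of regular files directly in it.

    Hypothetical helper: unlike the np.unique version, subdirectories are
    not counted, and directories without files are absent from the result.
    """
    root = Path(root)
    # rglob('*') yields every entry at every depth; keep only regular files
    return Counter(p.parent for p in root.rglob('*') if p.is_file())


if __name__ == "__main__":
    counts = count_files_per_dir(Path.cwd())
    for directory, n in sorted(counts.items()):
        print(directory, n)
    # total number of files in the whole tree
    print(sum(counts.values()))
```

Summing the values gives the total file count for the whole tree, which is what the question asks for.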

Upvotes: 7

Samuel

Reputation: 23

I suggest using recursion, as in the function below. Note that it counts the folders at every depth, not the files:

import os

def get_folder_count(path):
    # immediate subfolders of path
    folders = [name for name in os.listdir(path)
               if os.path.isdir(os.path.join(path, name))]
    count = len(folders)
    # add the number of folders nested inside each subfolder, recursively
    for folder in folders:
        count += get_folder_count(os.path.join(path, folder))
    return count
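
Since the question asks for files rather than folders, a file-counting sketch with os.walk may be closer to what is wanted. os.walk already visits every directory below the starting point, so no manual recursion is needed. The name count_files is hypothetical. One possible reason for os.walk appearing to return nothing is a starting path that does not exist or is not a directory: by default os.walk ignores such errors and simply yields nothing.

```python
import os


def count_files(root):
    """Return the number of non-directory entries under root, at any depth.

    Hypothetical helper, shown as a sketch rather than a definitive answer.
    """
    total = 0
    # os.walk yields one (dirpath, dirnames, filenames) tuple per directory
    for dirpath, dirnames, filenames in os.walk(root):
        total += len(filenames)
    return total


if __name__ == "__main__":
    print(count_files(os.getcwd()))
```

Passing the outer folder's path returns the total for the whole tree, however deeply it is nested.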

Upvotes: 1
