johankent30

Reputation: 75

How to count the number of files in subdirectories?

I have the following file structure and would like to use Python to create a dictionary of the number of files in each folder. The example at the bottom would translate to the following dictionary:

{'Employee A': {'Jan': 3, 'Feb': 2}, 'Employee B': {'Jan': 2, 'Feb': 1}}

Does anyone know how to iterate over the directory using the os module to do this?

Employee A
    Jan
        File 1
        File 2
        File 3
    Feb
        File 1
        File 2
Employee B
    Jan
        File 1
        File 2
    Feb
        File 1

Upvotes: 1

Views: 1107

Answers (4)

martineau

Reputation: 123463

With a few minor adjustments, the ActiveState Python recipe "Create a nested dictionary from os.walk" can be made to do what you want:

try:
    reduce
except NameError:  # Python 3
    from functools import reduce
import os

def count_files_in_directories(rootdir):
    """ Creates a nested dictionary that represents the folder structure
        of rootdir with a count of files in the lower subdirectories.
    """
    tree = {}  # renamed from "dir" to avoid shadowing the builtin
    rootdir = rootdir.rstrip(os.sep)
    start = rootdir.rfind(os.sep) + 1
    for path, dirs, files in os.walk(rootdir):
        folders = path[start:].split(os.sep)
        # A directory with files maps to its file count; otherwise to a dict for its subdirectories.
        subdir = len(files) if files else {}
        parent = reduce(dict.get, folders[:-1], tree)
        parent[folders[-1]] = subdir

    return list(tree.values())[0]

startdir = "./sample"
res = count_files_in_directories(startdir)
print(res)  # -> {'Employee A': {'Feb': 2, 'Jan': 3}, 'Employee B': {'Feb': 1, 'Jan': 2}}

Note that the ./sample directory is the root of a folder structure I created for testing that's exactly like the one shown in your question.

Upvotes: 0

John

Reputation: 2127

Look into parsing the output of os.walk.

For example:

import os

mydict = {}
# Walk bottom-up and record the file count of every directory that contains files
for root, dirs, files in os.walk('testdir', topdown=False):
    if len(files) > 0:
        mydict[root] = len(files)
print(mydict)

prints

{'testdir/EmployeeB/Jan': 2, 'testdir/EmployeeA/Feb': 2, 'testdir/EmployeeB/Feb': 1, 'testdir/EmployeeA/Jan': 3}

You could pretty easily parse those keys to generate the nested dictionary that you're looking for.
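
One way to do that parsing is sketched below. The helper name nest_counts is hypothetical, and the sketch assumes files live only in the lowest-level folders (as in the question), so a folder never needs to hold both a count and subfolders. It drops the first path component (the root, 'testdir' here) and builds the nesting from the rest:

```python
import os

def nest_counts(flat_counts, sep=os.sep):
    """Turn {'root/A/Jan': 3, ...} into {'A': {'Jan': 3}, ...}."""
    nested = {}
    for path, count in flat_counts.items():
        parts = path.split(sep)[1:]  # skip the top-level directory name
        if not parts:
            continue  # files sitting directly in the root are ignored
        node = nested
        for part in parts[:-1]:
            # Descend, creating intermediate dictionaries as needed
            node = node.setdefault(part, {})
        node[parts[-1]] = count
    return nested
```

Calling nest_counts(mydict) on the dictionary above should give one inner dictionary per employee, keyed by month. The sep argument defaults to os.sep because that is the separator os.walk uses when it joins paths.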

Upvotes: 3

Arya11

Reputation: 568

Use the os library:

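import os

path = '.'  # assumption: root directory to inspect; replace with your own
parent = os.listdir(path)  # names of the entries directly under path
child = []
for x in parent:
    full = os.path.join(path, x)
    if os.path.isdir(full):
        child.append(os.listdir(full))  # contents of the subdirectory
    else:
        child.append('')  # plain files have no children
d = dict(zip(parent, child))
print(d)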

This is the basic logic for building a dictionary out of directories. However, it only handles two levels; I'll leave the n-level part to you.
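
For the n-level part, a recursive version might look like the sketch below. The function name count_files is hypothetical. It assumes that a folder containing subfolders should map to a nested dictionary and that a folder without subfolders should map to its file count; files sitting next to subfolders are ignored:

```python
import os

def count_files(path):
    """Return a file count for a leaf directory, or a nested dict
    keyed by subdirectory name for a directory that has subdirectories."""
    subdirs = []
    n_files = 0
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_dir():
                subdirs.append(entry)
            elif entry.is_file():
                n_files += 1
    if not subdirs:
        return n_files
    # Recurse into each subdirectory; files at this level are not counted
    return {d.name: count_files(d.path) for d in subdirs}
```

Called on the folder that holds "Employee A" and "Employee B", this should produce the nested dictionary from the question.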

Upvotes: 0

clearshot66

Reputation: 2302

Something like this would let you iterate over all files in a directory and create a list of them. You can modify it as needed:

import os

error_log_list = []
file_paths_list = []  # master list of absolute file paths, filled in below

def traverse_structure():

  try:
    root = r"C:\Users\Whatever\Desktop\DirectoryToSearch"
    # Change working directory
    os.chdir(root)

    print("Creating master list of the directory structure...\n")

    # Traverse the folder structure
    for folder, subfolders, files in os.walk(root):

      # Pass over each file
      for file in files:

        absolute_path = os.path.join(folder,file)

        # Create a master file list
        file_paths_list.append(absolute_path)

  except Exception as e:
    error_log_list.append( "Failed to open the root directory specified "+root+"\n Error: "+str(e)+"\n" )

traverse_structure()
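
To get from a list of file paths to per-folder counts, one possible modification is to tally each path's parent directory. This is only a sketch: the helper name count_by_folder is hypothetical, and it returns a flat mapping from folder path to count rather than the nested dictionary asked for in the question:

```python
import os
from collections import Counter

def count_by_folder(paths):
    """Map each parent directory to the number of paths that live in it."""
    return Counter(os.path.dirname(p) for p in paths)
```

Passing file_paths_list to count_by_folder after the traversal would give one entry per folder that contains at least one file.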

Upvotes: 0
