rjbogz

Reputation: 870

Getting directory size of subdirectory in Python

I am walking a directory that contains multiple subdirectories, which may themselves contain subdirectories.

Folder
+-Sub1
| +-SubSub1
| +-File1
+-Sub2
| +-File2
+-Sub3
| +-File3
| +-File4
+-Sub4
  +-File5
  +-SubSub2
    +-File6

I would like to get the size of each subfolder (Sub1, Sub2, etc.) in its entirety. I also need to get the name of each folder. For example:

Sub1 is 34 MB
Sub2 is 2893 MB
...

I currently have the following:

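import os
from os.path import getsize, join

for r, d, f in os.walk(directory):
    # Size in MB of the files directly inside r (subfolders not included)
    size = sum(getsize(join(r,n)) for n in f) / 1048576
    print(size)
    for s in d:
        print(s)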

which prints out all the sizes followed by all of the directory names because they are each in separate for loops. How can I print it as stated above?

Upvotes: 2

Views: 4445

Answers (3)

blvb

Reputation: 129

I found this question while looking for something similar. Inspired by the accepted solution by rjbogz and the answer by Robᵩ, I made this. It reports the total size of a folder's contents, including everything in its subfolders.

import os


def get_size(source):
    # Total size in bytes of every file under source, including subfolders.
    total_size = 0
    for item in os.listdir(source):
        itempath = os.path.join(source, item)
        if os.path.isfile(itempath):
            total_size += os.path.getsize(itempath)
        elif os.path.isdir(itempath):
            total_size += get_size(itempath)
    return total_size


def walk_recursive(directory, level):
    # Print the size of each subfolder, indenting one step per nesting level.
    for d in next(os.walk(directory))[1]:
        itempath = os.path.join(directory, d)
        size = get_size(itempath)
        if level == 0:
            path = d
        else:
            # Mark nested folders with the platform's path separator.
            path = os.sep + d
        print('    ' * level +
              '{:6.2f}'.format(float(size) / 1048576) + ' MB  ' + path)
        # Recurse into this subfolder to list its own subfolders.
        walk_recursive(itempath, level + 1)


directory = r"<put_your_dir_path_here>"
walk_recursive(directory, level=0)
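
The recursion in get_size can also be replaced by a single os.walk pass, since os.walk already visits every level. A minimal sketch, assuming only file bytes should be counted and symbolic links should be ignored; tree_size and subfolder_sizes are names made up for this example:

```python
import os


def tree_size(path):
    """Return the total size in bytes of all regular files under path."""
    total = 0
    for root, dirs, files in os.walk(path):
        for name in files:
            filepath = os.path.join(root, name)
            # Skip symlinks so a link's target is not counted.
            if not os.path.islink(filepath):
                total += os.path.getsize(filepath)
    return total


def subfolder_sizes(directory):
    """Map each immediate subfolder name of directory to its size in bytes."""
    sizes = {}
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isdir(path):
            sizes[name] = tree_size(path)
    return sizes
```

Dividing each value returned by subfolder_sizes(directory) by 1048576 gives the per-folder figure in MB.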

Upvotes: 2

rjbogz

Reputation: 870

I ended up creating the following function:

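import os

def get_size_bytes(source):
    # Size of the directory entry itself plus everything beneath it, in bytes.
    total_size = os.path.getsize(source)
    for item in os.listdir(source):
        itempath = os.path.join(source, item)
        if os.path.isfile(itempath):
            total_size += os.path.getsize(itempath)
        elif os.path.isdir(itempath):
            total_size += get_size_bytes(itempath)
    return total_size

def get_size(source):
    # Convert to MB once, at the top level, so the recursion never mixes units.
    return float(get_size_bytes(source)) / 1048576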

and then calling it in my for loop:

for d in next(os.walk(directory))[1]:
    size = get_size(os.path.join(directory, d))
    print(d + ' is ' + str(size) + ' MB')
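
One caveat with this kind of recursion: os.path.isdir follows symbolic links, so a link that points back up the tree can make the function recurse until it fails. A hedged sketch of a variant that skips links, assuming Python 3.5 or later (for os.scandir) and counting only file bytes rather than the directory entries themselves; scan_size is a made-up name:

```python
import os


def scan_size(path):
    """Return the size in bytes of all regular files under path, skipping symlinks."""
    total = 0
    for entry in os.scandir(path):
        if entry.is_symlink():
            # Never follow links, so a cycle cannot cause endless recursion.
            continue
        if entry.is_file(follow_symlinks=False):
            total += entry.stat(follow_symlinks=False).st_size
        elif entry.is_dir(follow_symlinks=False):
            total += scan_size(entry.path)
    return total
```

It can be called in the same loop as above, with the division by 1048576 done at the call site.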

Upvotes: 0

Robᵩ

Reputation: 168596

As a first step, try this:

import os
for r, d, f in os.walk('.'):
    # Whole megabytes of the files directly inside r
    size = sum(os.path.getsize(os.path.join(r,n)) for n in f) // 1048576
    print("{} is {}".format(r, size))

On my PC, the result is this:

. is 1
./Sub4 is 1
./Sub4/SubSub2 is 1
./Sub3 is 2
./Sub2 is 1
./Sub1 is 1
./Sub1/SubSub1 is 0

This will at least print the directory names next to the associated sizes.

As the next step, you'll need to find a way to sum the subordinate sizes into the size of the parent directory. In this example, I walk the tree bottom-up and use a dictionary to remember the sizes of the subdirectories:

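import os
dir_sizes = {}
# topdown=False visits subdirectories before their parents
for root, dirs, files in os.walk('.', topdown=False):
    # Sizes of the files and of the directory entries themselves
    size = sum(os.path.getsize(os.path.join(root, name)) for name in files + dirs)
    # Add the totals already computed for each subdirectory
    # (symlinked directories are listed but not walked, hence .get)
    size += sum(dir_sizes.get(os.path.join(root, name), 0) for name in dirs)
    dir_sizes[root] = size
    print("{} is {} MB".format(root, size // 2**20))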

Result (each FileN is 1 megabyte):

./Sub4/SubSub2 is 1 MB
./Sub4 is 2 MB
./Sub3 is 2 MB
./Sub2 is 1 MB
./Sub1/SubSub1 is 0 MB
./Sub1 is 1 MB
. is 6 MB
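
To report only the top-level folders, in the form the question asks for, the same bottom-up dictionary can be filled first and then read back for the immediate children. A sketch under the assumption that only file bytes are counted and symlinks are ignored; top_level_sizes is a hypothetical helper name:

```python
import os


def top_level_sizes(directory):
    """Return {subfolder name: total bytes} for the immediate subfolders of directory."""
    dir_sizes = {}
    # topdown=False yields children before parents, so every child total
    # is already stored when its parent is processed.
    for root, dirs, files in os.walk(directory, topdown=False):
        size = sum(os.path.getsize(os.path.join(root, name))
                   for name in files
                   if not os.path.islink(os.path.join(root, name)))
        # Symlinked directories are listed in dirs but never walked, hence .get().
        size += sum(dir_sizes.get(os.path.join(root, name), 0) for name in dirs)
        dir_sizes[root] = size
    # next(os.walk(...))[1] is the list of immediate subdirectory names.
    return {name: dir_sizes.get(os.path.join(directory, name), 0)
            for name in next(os.walk(directory))[1]}
```

Each entry can then be printed with "{} is {} MB".format(name, size // 2**20).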

Upvotes: 1
