Sascha

Reputation: 13

Python for loop appends only the last list as value

I am looping through a directory and want to store the files of each folder as a list in a dictionary, where each key is a folder and the value is the list of files in that folder.

The first print in the loop shows exactly the output I am expecting.

However, the second print shows empty values.

The third print, after the class has been instantiated, shows the file list of the last subfolder as the value for every key.

What am I overlooking or doing wrong?

import glob
import os

class FileAndFolderHandling() :

    folders_and_files = dict()


    def __init__(self) :
        self.getSubfolderAndImageFileNames()


    def getSubfolderAndImageFileNames(self) :

        subfolder = ""
        files_in_subfolder = []

        for filename in glob.iglob('X:\\Some_Directory\\**\\*.tif', recursive=True) :

            if not subfolder == os.path.dirname(filename) and not subfolder == "" :
                print(subfolder + "  /  /  " + str(files_in_subfolder))
                self.folders_and_files[subfolder] = files_in_subfolder   
                files_in_subfolder.clear()
                print(self.folders_and_files)

            subfolder = os.path.dirname(filename) # new subfolder
            files_in_subfolder.append(os.path.basename(filename))



folder_content = FileAndFolderHandling()

print(folder_content.folders_and_files)

Upvotes: 1

Views: 1357

Answers (3)

quamrana

Reputation: 39354

It sounds like you are after defaultdict.

I adapted your code like this:

import glob, os
from collections import defaultdict

class FileAndFolderHandling() :
    folders_and_files = defaultdict(list)

    def __init__(self) :
        self.getSubfolderAndImageFileNames()

    def getSubfolderAndImageFileNames(self) :
        for filename in glob.iglob(r'C:\Temp\T\**\*.txt', recursive=True) :
            # print(filename)
            subfolder = os.path.dirname(filename)
            self.folders_and_files[subfolder].append(os.path.basename(filename))


folder_content = FileAndFolderHandling()

print(dict(folder_content.folders_and_files))

Output:
{'C:\\Temp\\T': ['X.txt'], 'C:\\Temp\\T\\X': ['X1.txt', 'X2.txt'], 'C:\\Temp\\T\\X2': ['X1.txt']}

The defaultdict(list) creates a new, empty list the first time a missing key is accessed, so every folder gets its own list. This is what you seem to want to happen in your code.
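As a minimal sketch of that behavior (the keys and file names here are made up for illustration and do not touch the file system), each missing key gets its own independent list:

```python
from collections import defaultdict

groups = defaultdict(list)

# "X" is not in the dictionary yet, so defaultdict first creates an empty
# list for it, and only then is the file name appended.
groups["X"].append("X1.txt")
groups["X"].append("X2.txt")

# A different key gets a different list object.
groups["X2"].append("X1.txt")

print(dict(groups))  # {'X': ['X1.txt', 'X2.txt'], 'X2': ['X1.txt']}
```

Because no list is ever shared or cleared, appending to one key cannot affect another.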

Upvotes: 1

Ben Brown

Reputation: 329

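You are clearing the list right after storing it, from what I see...

files_in_subfolder.clear()

The dictionary holds that same list object, so clearing it also empties the value you just stored. Remove that call, or make sure a copy of the list gets added to the folders_and_files variable before any clear operation.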

Upvotes: 1

Rodrigo De Rosa

Reputation: 86

It seems like the problem is that you are always using the same list.

Defining files_in_subfolder = [] creates a list and stores a reference (a pointer) to that list in the variable you just defined. So when you assign self.folders_and_files[subfolder] = files_in_subfolder, you are only storing the reference to your list (which is the same in every iteration) in the dictionary, not a copy of the list.

Later, when you call files_in_subfolder.clear(), you are clearing the list that the reference points to, and therefore the value of every entry in the dictionary (as it was always the same list).

To solve this, I would recommend creating a new list for each entry in your dictionary instead of clearing the same one over and over. That is, assign a fresh list to files_in_subfolder inside the loop whenever the subfolder changes, rather than defining it once outside the loop and clearing it.
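The effect described above can be reproduced in a few lines. This is only an illustrative sketch with made-up names, not the code from the question:

```python
files = []
result = {}

files.append("a.tif")
result["folder1"] = files   # stores a reference to the list, not a copy
files.clear()               # empties the one shared list

files.append("b.tif")
result["folder2"] = files   # same list object again

print(result)                                  # {'folder1': ['b.tif'], 'folder2': ['b.tif']}
print(result["folder1"] is result["folder2"])  # True
```

Both keys end up showing the contents that were appended last, which matches the third print in the question.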
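One way this could look is sketched below. The helper name group_by_folder is hypothetical, and it takes an iterable of paths (for example the result of glob.iglob) so that it can be tried without a real directory. It assumes, like the loop in the question, that all files of one folder arrive consecutively. It also stores the last folder after the loop, which the original loop appears to skip.

```python
import os

def group_by_folder(paths):
    """Group file names by parent folder (hypothetical helper, a sketch only)."""
    folders_and_files = {}
    subfolder = None
    files_in_subfolder = []

    for path in paths:
        folder = os.path.dirname(path)
        if folder != subfolder:
            if subfolder is not None:
                folders_and_files[subfolder] = files_in_subfolder
            # Bind a NEW list object instead of calling .clear(), so the
            # list already stored in the dictionary is left untouched.
            files_in_subfolder = []
            subfolder = folder
        files_in_subfolder.append(os.path.basename(path))

    # Store the files of the last folder once the loop has finished.
    if subfolder is not None:
        folders_and_files[subfolder] = files_in_subfolder
    return folders_and_files


paths = [
    os.path.join("a", "1.tif"),
    os.path.join("a", "2.tif"),
    os.path.join("b", "3.tif"),
]
print(group_by_folder(paths))  # {'a': ['1.tif', '2.tif'], 'b': ['3.tif']}
```

If the paths are not guaranteed to be grouped by folder, a defaultdict(list) keyed on the folder is probably the safer choice.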

Hope it helps!

Upvotes: 2
