Fabian Brunner

Reputation: 21

How to create a list of dictionaries in Python

I'm relatively new to Python, actually to programming as a whole. Unfortunately, I have not been able to find an answer to my question on the forum yet.

I have a list of file extensions in which some extensions occur multiple times. See the example here:

extensions = ["JPG", "XLSX", "MP3", "PDF", "EXE", "PY", "XLSX", "DOCX", "JPG", "PPTX"]

I want to create a new list of dictionaries using the above list. It should look like this:

dicts = [{"Extension": "py", "Count": 1}, {"Extension": "docx", "Count": 1}]

My plan is to iterate over the list and append each file extension to the new list as a new dictionary, as shown in the line of code above. If the extension already exists as a dictionary in the list of dictionaries, only the "Count" value of the matching dictionary should be incremented with += 1. I have written the following code, but it does not work.

I know that the empty extensionlist within the function is one problem, but I still can't get it to work as intended. I would appreciate any help.

extensions = ["JPG", "XLSX", "MP3", "PDF", "EXE", "PY", "XLSX", "DOCX", "JPG", "PPTX"]


def get_extensions(extensions):
    extensionlist = []
    for item in extensions:
        extension = item.lower()
        for dictionary in extensionlist:
            if dictionary["Extension"] == extension:
                dictionary["Count"] += 1
                break
            else:
                extensionlist.append({"Extension": extension, "Count": 1})
                break
    return extensionlist


test = get_extensions(extensions)
print(test)

Upvotes: 2

Views: 102

Answers (2)

crcvd

Reputation: 1535

You can build the frequency table with a Counter and then iterate over that to construct your list:

from collections import Counter

extensions = ["JPG", "XLSX", "MP3", "PDF", "EXE", "PY", "XLSX", "DOCX", "JPG", "PPTX"]

# Lowercase before counting so that e.g. "JPG" and "jpg" are counted together.
frequencies = Counter(ext.lower() for ext in extensions)

# Build a list of dicts using a list comprehension. Not
# really sure why you'd want it in this format (rather
# than a dictionary).
output = [
    {"Extension": ext, "Count": freq}
    for ext, freq in frequencies.items()
]

If you wanted to do this "manually" using a for loop, I'd suggest a similar approach: first construct a dictionary of extension keys to frequency counts, and then construct the list:

frequencies = {}

for extension in extensions:
    # d.get(key, default) is like d[key], except it
    # returns default if key is not in d (rather than
    # raising a KeyError).
    frequencies[extension] = frequencies.get(extension, 0) + 1

# This is less idiomatic than the list comprehension 
# shown above, but it's the same end result.
output = []
for extension, frequency in frequencies.items():
    output.append(...)
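
As a minimal sketch, the manual version could be completed as shown here. It assumes the extensions should be lowercased before counting, so that "JPG" and "jpg" land in the same bucket; the function name count_extensions is made up for illustration and does not appear in the question.

```python
def count_extensions(extensions):
    """Return a list of {"Extension", "Count"} dicts, one per distinct extension."""
    frequencies = {}
    for extension in extensions:
        # Assumption: normalise case first so "JPG" and "jpg" are counted together.
        key = extension.lower()
        frequencies[key] = frequencies.get(key, 0) + 1

    # Dicts preserve insertion order (Python 3.7+), so the output follows
    # the order in which each extension was first seen.
    output = []
    for extension, frequency in frequencies.items():
        output.append({"Extension": extension, "Count": frequency})
    return output


extensions = ["JPG", "XLSX", "MP3", "PDF", "EXE", "PY", "XLSX", "DOCX", "JPG", "PPTX"]
print(count_extensions(extensions))
```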

This is better than your nested for loops because it makes one pass over extensions and then one pass over frequencies. Even if your current implementation worked, it would do a linear scan over the output list every time it needs to determine whether the list already contains a specific extension. In the worst case, each of your m extensions is compared against up to n entries already in the output list, so the work grows roughly as m × n rather than m + n.
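
If the list of dictionaries has to be built directly, the linear scan can still be avoided by keeping a lookup from each extension to its dictionary. This is only a sketch of that idea; the names build_extension_list and index are hypothetical and not taken from the question.

```python
def build_extension_list(extensions):
    """Build the list of dicts in a single pass, without scanning the output list."""
    result = []  # the list of dicts, in first-seen order
    index = {}   # hypothetical helper: extension -> its dict inside result
    for item in extensions:
        extension = item.lower()
        entry = index.get(extension)
        if entry is None:
            # First time this extension is seen: create its dict and remember it.
            entry = {"Extension": extension, "Count": 0}
            index[extension] = entry
            result.append(entry)
        # The same dict object is referenced by index and result,
        # so incrementing it here updates the entry in the list too.
        entry["Count"] += 1
    return result
```

Each dictionary lookup takes constant time on average, so the whole function is linear in the number of input extensions.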

Upvotes: 5

maziyank

Reputation: 616

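Your code is almost right. The problem is that the else branch is never reached: extensionlist starts out empty, so the body of the inner loop never runs. Dedent the else so that it belongs to the inner for loop rather than the if, and remove the break after append (it would now exit the outer loop). The else block then runs only when the inner loop finishes without hitting break, that is, when no matching dictionary was found.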

extensions = ["JPG", "XLSX", "MP3", "PDF", "EXE", "PY", "XLSX", "DOCX", "JPG", "PPTX"]


def get_extensions(extensions):
    extensionlist = []
    for item in extensions:
        extension = item.lower()
        for dictionary in extensionlist:
            if dictionary["Extension"] == extension:
                dictionary["Count"] += 1
                break
        else:
            extensionlist.append({"Extension": extension, "Count": 1})
            
    return extensionlist


test = get_extensions(extensions)
print(test)
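
For readers unfamiliar with for...else, here is a small standalone illustration that is separate from the code in the question: the else block runs only when the loop ends without break, which includes the case where the iterable is empty. The helper name find_first is made up for this example.

```python
def find_first(items, target):
    """Return a message describing whether target occurs in items."""
    for item in items:
        if item == target:
            result = f"found {target}"
            break
    else:
        # Reached only if the loop ended without break
        # (also when items is empty, so the loop body never ran).
        result = f"{target} not found"
    return result


print(find_first(["jpg", "pdf"], "pdf"))  # found pdf
print(find_first(["jpg", "pdf"], "py"))   # py not found
print(find_first([], "py"))               # py not found
```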

Upvotes: 2
