ganesh reddy

Reputation: 1892

Python dictionary

I am having some trouble understanding this. I have tried to reduce the problem to the following code:

for k in y.keys():
    if k in dateDict.keys():

        if yearDict[k] in dict1:
            dict1[yearDict[k]].extend(y[k])
        else:
            dict1[yearDict[k]] = y[k]

        if yearDict[k] in dict2:
            dict2[yearDict[k]].extend(y[k])
        else:
            dict2[yearDict[k]] = y[k]
    else:
        continue

I have two dictionaries, y and dateDict, to begin with. For each key of y that also appears in dateDict, I populate two other dictionaries, dict1 and dict2, keyed by values from another dictionary, yearDict. Unfortunately, the results in dict1 and dict2 contain duplicates: values repeat themselves. Any idea what could be happening?

Also, I notice that this code works as expected:

for k in y.keys():
    if k in dateDict.keys():

        if yearDict[k] in dict1:
            dict1[yearDict[k]].extend(y[k])
        else:
            dict1[yearDict[k]] = y[k]
    else:
        continue

Upvotes: 0

Views: 114

Answers (1)

Rafael Lerm

Reputation: 1400

If y[k] is a list (which it looks like it is), the same list object is shared everywhere it is assigned. Dictionaries do not copy values when they are assigned; they only keep references to the objects. In your example, the entries in dict1 and dict2 for a given key both point to the same list.

Later, when another key maps to the same year, extend is called on that shared list once through each dictionary, so the new values are appended twice. To prevent this, create a new list when initially assigning, in both else branches:

dict1[yearDict[k]] = list(y[k])
dict2[yearDict[k]] = list(y[k])
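
The aliasing can be reproduced in isolation. The following is a minimal sketch with made-up data; the group helper, the copy_first flag, and the sample values are illustrative assumptions, not part of the question's code:

```python
def group(y, year_dict, copy_first):
    """Mimic the question's loop; copy_first toggles the list(...) fix."""
    dict1, dict2 = {}, {}
    for k in y:
        year = year_dict[k]
        for target in (dict1, dict2):
            if year in target:
                target[year].extend(y[k])
            else:
                # Without the copy, both dictionaries store the very same list.
                target[year] = list(y[k]) if copy_first else y[k]
    return dict1, dict2


def sample():
    # Fresh lists on every call, because the uncopied variant mutates its input.
    return {"a": [1, 2], "b": [3]}


years = {"a": 2020, "b": 2020}  # stands in for yearDict

buggy1, buggy2 = group(sample(), years, copy_first=False)
fixed1, fixed2 = group(sample(), years, copy_first=True)

print(buggy1[2020], buggy1[2020] is buggy2[2020])  # [1, 2, 3, 3] True
print(fixed1[2020], fixed1[2020] is fixed2[2020])  # [1, 2, 3] False
```

In the uncopied run, the value 3 is appended once through dict1 and once through dict2, and both appends land in the same list.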

However, it is always good to know the Python standard library. This code could be made much more readable, and without the error, by using collections.defaultdict:

from collections import defaultdict

# This goes wherever the dictionaries
# were initially defined.
dict1 = defaultdict(list)
dict2 = defaultdict(list)

# You can get the value here, no need to search it later.
for k, value in y.items():
    if k in dateDict:
        # No need to call this everywhere.
        new_key = yearDict[k]

        # Note the defaultdict magic.
        dict1[new_key].extend(value)
        dict2[new_key].extend(value)

    # No need for the 'continue' at the end either.

When asked for a key that does not exist yet, the defaultdict creates a new empty list for it on the fly, so you don't have to care about initialization or about creating copies of your values.
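
As a small illustration of that behaviour, here is a sketch using a hypothetical source list rather than the question's data. It also shows one thing to watch for: merely reading a missing key inserts it.

```python
from collections import defaultdict

source = [1, 2]  # hypothetical input list, standing in for y[k]

dict1 = defaultdict(list)
dict2 = defaultdict(list)

# Each missing key gets its own fresh list, so nothing is shared.
dict1[2020].extend(source)
dict2[2020].extend(source)
dict1[2020].extend([3])

print(dict1[2020])  # [1, 2, 3]
print(dict2[2020])  # [1, 2]
print(source)       # [1, 2]

# Looking up a missing key creates it as a side effect.
had_key_before = 2021 in dict1
_ = dict1[2021]
has_key_after = 2021 in dict1
print(had_key_before, has_key_after)  # False True
```

If that side effect is unwanted, test membership with `in` or use `.get()` instead of indexing.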

Upvotes: 1
