FanaticD

Reputation: 1467

Generating dictionary of lists in a loop (3.2.3)

I need to create a dictionary in which each key is a string and each value is a list. The trick is that I need to build it in a loop.

My minimal code looks like this at the moment:

for elem in xmlTree.iter():

  # skipping root element
  if elem.tag == xmlTree.getroot().tag:
    continue
  # this is supposed to be my temporary list
  tmpList = []
  for child in elem:
      tableWColumns[elem.tag] = tmpList.append(child.tag)
print(tableWColumns)

This prints only the list created in the last iteration.

The problem apparently lies in the fact that whenever I change the list, everything that references it changes as well. I found that much by Googling. What I have not found is how to deal with it when using a loop.

The solution I am supposed to use to keep the list is to copy it to another list, so that I can change the original without losing data. What I don't know is how to do that when I need to do it dynamically.

Also, I am limited to the standard library only.

Upvotes: 0

Views: 84

Answers (2)

Kasravnd

Reputation: 107307

The problem is that you create a new list with tmpList = [] in each iteration and assign to the same key again, so Python replaces the old value with the new one every time. As a result, you only see the result of the last iteration in your output.

Instead, you can use collections.defaultdict:

from collections import defaultdict
d=defaultdict(list)

for elem in xmlTree.iter():
  # skipping root element
  if elem.tag == xmlTree.getroot().tag:
    continue
  # defaultdict creates an empty list the first time a key is accessed
  for child in elem:
      d[elem.tag].append(child.tag)
print(d)

Or you can use the dict.setdefault method:

d={}
for elem in xmlTree.iter():
  # skipping root element
  if elem.tag == xmlTree.getroot().tag:
    continue
  # setdefault inserts an empty list if the key is missing, then returns the stored list
  for child in elem:
      d.setdefault(elem.tag,[]).append(child.tag)
print(d)

Also note that, as @abarnert says, tmpList.append(child.tag) returns None, so the assignment actually stores None in tableWColumns[elem.tag].
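To see the None behaviour in isolation, here is a minimal sketch. The names tmp_list, result and table are made up for illustration and are not from the question:

```python
tmp_list = []

# list.append mutates the list in place and returns None
result = tmp_list.append("a")

table = {}
table["key"] = result  # this stores None, not the list

print(table)     # {'key': None}
print(tmp_list)  # ['a']
```

The list itself was updated correctly; only the value that ended up in the dictionary is wrong.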

Upvotes: 3

abarnert

Reputation: 365825

The big problem here is that tmpList.append(child.tag) returns None. In fact, almost all mutating methods in Python return None.

To fix that, you can either do the mutation, then insert the value in a separate statement:

for child in elem:
    tmpList.append(child.tag)
tableWColumns[elem.tag] = tmpList

… or avoid mutating the list in the first place. For example:

tableWColumns[elem.tag] = tmpList + [child.tag for child in elem]

That will get rid of your all-values-are-None problem, but then you've got a new problem. If any tag appears more than once, you're only going to get the children from the last copy of that tag, not from all copies. That's because you build a new list each time, and reassign tableWColumns[elem.tag] to that new list, instead of modifying whatever was there.
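A small sketch of that overwriting behaviour, assuming a made-up sample document in which the tag "table" appears twice (the XML string and the name overwritten are only for illustration):

```python
import xml.etree.ElementTree as ET

# Hypothetical sample input: "table" occurs twice under the root
xml_tree = ET.ElementTree(ET.fromstring(
    "<root><table><a/><b/></table><table><c/></table></root>"
))
root_tag = xml_tree.getroot().tag

overwritten = {}
for elem in xml_tree.iter():
    # skipping root element
    if elem.tag == root_tag:
        continue
    # a fresh list is assigned each time, replacing any earlier value
    overwritten[elem.tag] = [child.tag for child in elem]

print(overwritten["table"])  # ['c']
```

The children a and b of the first "table" element are lost, because the second "table" element replaces the list stored under the same key.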

To solve that problem, you need to fetch the existing value into tmpList instead of creating a new one:

tmpList = tableWColumns.get(elem.tag, [])
tableWColumns[elem.tag] = tmpList + [child.tag for child in elem]

Or, as Kasra's answer says, you can simplify this by using a defaultdict or the setdefault method.
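Putting it together, a self-contained sketch with defaultdict might look like the following. The sample XML and the name table_w_columns are assumptions for illustration; substitute your own tree:

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample input: "table" occurs twice under the root
xml_tree = ET.ElementTree(ET.fromstring(
    "<root><table><a/><b/></table><table><c/></table></root>"
))
root_tag = xml_tree.getroot().tag

table_w_columns = defaultdict(list)
for elem in xml_tree.iter():
    # skipping root element
    if elem.tag == root_tag:
        continue
    # extend the list already stored for this tag instead of replacing it
    table_w_columns[elem.tag].extend(child.tag for child in elem)

print(dict(table_w_columns))
# {'table': ['a', 'b', 'c'], 'a': [], 'b': [], 'c': []}
```

Note that this variant also creates an empty list for elements without children. If you only want keys for elements that have children, use the nested for loop with append from Kasra's answer instead.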

Upvotes: 1
