willrobertshaw

Reputation: 5

Having trouble iterating over a dictionary in Python

I have defined a class Lexicon:

class Lexicon:
    """stores known word stems of various part-of-speech categories"""

    def __init__ (self):
        self.catDict = {}

    def add(self,stem,cat):
        for k, v in self.catDict.iteritems():
            if (k != cat and v != stem):
                self.catDict[cat] = stem

When I call the Lexicon.add() method, I want it to take a word, e.g. "John", and the category of that word, e.g. "P", so a call would look something like this:

Lexicon.add("John","P")

I want this to be stored in the catDict dictionary, but only if there isn't already a 'P':'John' entry in it. My problem seems to lie in the for loop and if statement.

Without the for loop and if statement, my code works. But with them in place to filter out duplicate entries, I am left with an empty dictionary. Here is the terminal transcript from a test with the for loop and if statement present:

>>> from statements import Lexicon
>>> lx = Lexicon()
>>> lx.catDict
{}
>>> lx.add("John","P")
>>> lx.catDict
{}
>>> 

Upvotes: 0

Views: 72

Answers (3)

salomonvh

Reputation: 1809

Have you tried testing for the key in the dict?

if some_key not in self.catDict:
    self.catDict[some_key] = some_value

Upvotes: 0

RedX

Reputation: 15175

Normally a dictionary offers only one way of looking up items: by key. What you are doing would be better suited to a tuple.

But since I don't know exactly what you are trying to do, here is one possible solution:

def add(self, stem, cat):
    """Only adds stem if cat is not present."""
    if cat not in self.catDict:  # the proper way to look up an item in a dict
        self.catDict[cat] = stem
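
To see what a key-only check like this implies, here is a minimal sketch, assuming Python 3 and a standalone function (add_if_new_category is a hypothetical name used only for illustration, not part of the original class). Note that a second stem for an existing category is silently dropped:

```python
def add_if_new_category(cat_dict, stem, cat):
    """Store stem under cat only if cat is not already a key (hypothetical helper)."""
    if cat not in cat_dict:
        cat_dict[cat] = stem


cat_dict = {}
add_if_new_category(cat_dict, "John", "P")  # "P" is new, so it is stored
add_if_new_category(cat_dict, "John", "P")  # duplicate entry: no change
add_if_new_category(cat_dict, "Mary", "P")  # "P" is already taken: "Mary" is dropped
print(cat_dict)
```

If each category should hold more than one stem, a plain category-to-stem mapping is probably not the right shape for the data.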

Upvotes: 1

Thayne

Reputation: 6992

What your code does is loop over every entry in the dictionary (keep in mind that initially it is empty, so nothing will happen), and then, for each entry whose key and value both differ from your input, store stem in self.catDict[cat]. Do you see the problem?

In fact there are two problems:

1. Since the dictionary is initially empty, the for loop is a no-op the first time, so the dictionary stays empty and the add method does nothing.
2. Even if the dictionary has something in it, you do the comparison on every iteration, so you end up adding the entry whenever at least one existing entry differs from it in both key and value.
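
Both problems can be reproduced with a small sketch, assuming Python 3 (so items() instead of iteritems()) and a standalone function in place of the method (add_with_loop is a hypothetical name). The sketch loops over a snapshot of the items, because adding a new key while iterating over the dictionary itself raises a RuntimeError:

```python
def add_with_loop(cat_dict, stem, cat):
    """Mimics the loop-based check from the question (hypothetical helper)."""
    # Iterate over a snapshot so that adding a key mid-loop cannot raise
    # "dictionary changed size during iteration".
    for k, v in list(cat_dict.items()):
        if k != cat and v != stem:
            cat_dict[cat] = stem


empty = {}
add_with_loop(empty, "John", "P")
print(empty)  # the loop body never runs, so nothing is added

one_entry = {"P": "John"}
add_with_loop(one_entry, "John", "N")  # new category, but the stem matches
print(one_entry)  # the new category is not stored

two_entries = {"N": "dog", "V": "run"}
add_with_loop(two_entries, "John", "P")  # existing entries differ, so it is added
print(two_entries)
```

The second case shows that the condition also rejects entries that are not duplicates at all.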

However, your condition that it is added "only if there doesn't already exist a 'P':'John' in the dictionary" is already provided by the dict class. A dict only ever has one entry for a given key, so if you execute self.catDict['P'] = 'John' when 'P':'John' is already in the dictionary, you will still have only one 'P':'John' entry.
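
A short sketch of that behaviour, assuming Python 3 and a bare dictionary rather than the Lexicon class. It also shows the flip side: assigning a different value to an existing key replaces the old value instead of keeping both.

```python
cat_dict = {}
cat_dict["P"] = "John"
cat_dict["P"] = "John"  # same key and value again: still a single entry
count_after_repeat = len(cat_dict)

cat_dict["P"] = "Mary"  # same key, new value: "John" is replaced, not kept alongside
print(count_after_repeat, cat_dict)
```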

EDIT:

My guess is that what you really want is a way to keep track of a dictionary with categories as keys and sets of stems as values. For this, a combination of defaultdict and set is perfect:

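from collections import defaultdict


class Lexicon:
    """stores known word stems of various part-of-speech categories"""

    def __init__(self):
        # maps each category to the set of stems seen for it
        self.catDict = defaultdict(set)

    def add(self, stem, cat):
        # a missing category gets a fresh empty set on first access
        self.catDict[cat].add(stem)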

The way this works: catDict is a defaultdict, a dictionary that calls the function passed to it to construct a new value whenever an attempt is made to access a key that has not previously been set. In the add method we retrieve the value for a category with self.catDict[cat]. If we have already stored something for that category, the existing set is returned; if not, a new set is created and automatically stored as self.catDict[cat]. Then we add the stem to that set. Because sets contain only distinct values, stem is actually added only if it is not already in the set.
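
As a usage sketch, assuming Python 3 (the class is repeated so the example runs on its own, and the "V" category and "run" stem are made up for illustration):

```python
from collections import defaultdict


class Lexicon:
    """Stores known word stems of various part-of-speech categories."""

    def __init__(self):
        self.catDict = defaultdict(set)

    def add(self, stem, cat):
        self.catDict[cat].add(stem)


lx = Lexicon()
lx.add("John", "P")
lx.add("John", "P")  # duplicate: the set for "P" is unchanged
lx.add("Mary", "P")  # a second stem in the same category is kept
lx.add("run", "V")   # hypothetical category label, for illustration only

print(sorted(lx.catDict["P"]))
print("N" in lx.catDict)  # a membership test does not create an entry
```

One thing to keep in mind: reading lx.catDict[cat] for a category that has never been added creates an empty set for it, so use cat in lx.catDict when you only want to check whether a category exists.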

Upvotes: 1
