ShadyBears

Reputation: 4185

KeyError when attempting to create a nested dictionary

Problem statement: The method receives a list of tuples. Each tuple holds two items: an ID and a string. The instance variable search_criteria is a dictionary whose keys are group names and whose values are lists of keywords. For every keyword, the method should search the string in each tuple and collect the IDs of the tuples that contain it.

Example input:
results - (id, text-field)
search_criteria - (group name, keywords to search for)

results = [(1, "This is an example"), (2, "Another example"), (3, "Random String")]
search_criteria = {"HR": ["example", "harrassment", "fired"], "Maintenance": ["is", "Random", "Cleaning"]}

Example output:

{
    "HR" : {"example": [1,2]}, 
    "Maintenance" : { "is" : [1], "Random" : [3]}
}

If a word is found, map the group to the keyword and the keyword to the list of ids found.

def build_keywords_found_dict(self, results):
    group_dict = {}

    for group in self.search_criteria:
        for keyword in self.search_criteria[group]:
            keyword_dict = {}
            for data in results:
                if keyword in data[1]:
                    group_dict[group] = keyword_dict[keyword].append(data[0])

    return group_dict

Current output:

KeyError

Upvotes: 1

Views: 727

Answers (3)

Shachar Langer

Reputation: 651

You get the KeyError exception when the Python interpreter tries to run this line:

group_dict[group] = keyword_dict[keyword].append(data[0])

This exception is raised whenever you try to access a key that does not exist in a dictionary. In your code, keyword_dict is reset to an empty dictionary for every keyword and nothing is ever added to it, so the lookup keyword_dict[keyword] always fails.

In addition, list.append modifies the list in place and returns None, so even without the KeyError group_dict[group] would end up set to None.
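
Both problems can be reproduced in isolation. This is a minimal sketch with made-up values rather than the data from the question:

```python
keyword_dict = {}

# Looking up a missing key raises KeyError before append() is ever called.
try:
    keyword_dict["example"].append(1)
except KeyError as exc:
    missing_key = exc.args[0]  # the key that was not found

# list.append mutates the list in place and returns None.
ids = []
result = ids.append(1)
```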

The exact expected output isn't fully defined (what should happen if none of a group's keywords appear in any tuple?), but here is one possible solution that keeps most of your code. I added a comment explaining every line I changed or added:

def build_keywords_found_dict(self, results):
    # Create an empty dict for each group in search_criteria
    group_dict = {group: {} for group in self.search_criteria}

    for group in self.search_criteria:
        for keyword in self.search_criteria[group]:
            keyword_dict = {}
            for data in results:
                if keyword in data[1]:
                    # If the keyword doesn't exist in keyword_dict, add it with an empty list
                    keyword_dict.setdefault(keyword, [])
                    # Append the ID of the keyword to the list
                    keyword_dict[keyword].append(data[0])

            # If keyword_dict isn't empty, add it to group_dict
            if keyword_dict:
                group_dict[group][keyword] = keyword_dict[keyword]

    return group_dict
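
To try this against the example input from the question, the method needs a class around it. The sketch below is a condensed variant of the same idea, not the exact code above: the class name KeywordSearch and its constructor are assumptions, since the question only shows the method. Groups with no matching keyword keep an empty dict here.

```python
class KeywordSearch:
    """Hypothetical wrapper class; the question only shows the method."""

    def __init__(self, search_criteria):
        self.search_criteria = search_criteria

    def build_keywords_found_dict(self, results):
        # Start every group with an empty dict so group_dict[group] always exists.
        group_dict = {group: {} for group in self.search_criteria}
        for group, keywords in self.search_criteria.items():
            for keyword in keywords:
                # Substring match, as in the question's `keyword in data[1]`.
                ids = [item_id for item_id, text in results if keyword in text]
                if ids:
                    group_dict[group][keyword] = ids
        return group_dict


results = [(1, "This is an example"), (2, "Another example"), (3, "Random String")]
search_criteria = {
    "HR": ["example", "harrassment", "fired"],
    "Maintenance": ["is", "Random", "Cleaning"],
}
found = KeywordSearch(search_criteria).build_keywords_found_dict(results)
print(found)
# {'HR': {'example': [1, 2]}, 'Maintenance': {'is': [1], 'Random': [3]}}
```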

Upvotes: 0

blhsing

Reputation: 106648

You can create a reverse mapping dict that maps words to their criteria, so that you can iterate through the words in each phrase and map the words to their criteria in linear time:

mapping = {i: k for k, l in search_criteria.items() for i in l}
output = {}
for id, words in results:
    for word in words.split():
        if word in mapping:
            output.setdefault(mapping[word], {}).setdefault(word, []).append(id)

output becomes:

{'Maintenance': {'is': [1], 'Random': [3]}, 'HR': {'example': [1, 2]}}
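
Note that splitting each phrase into words matches whole words only, while the `keyword in text` test in the question matches substrings. The two agree on the example input but can differ on other text. Also, because the reverse mapping is a plain dict, a keyword listed under two groups would keep only the last group. A small sketch of the matching difference, where the string "This works" is a made-up sample:

```python
results = [(1, "This is an example"), (2, "Another example"), (3, "Random String")]

# Substring test (the question's approach): "is" is also found inside "This".
substring_hits = [item_id for item_id, text in results if "is" in text]

# Whole-word test (the split() approach): only the standalone word counts.
word_hits = [item_id for item_id, text in results if "is" in text.split()]

# A made-up text where the two tests disagree.
sample = "This works"
matches_as_substring = "is" in sample
matches_as_word = "is" in sample.split()
```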

Upvotes: 1

Solomon Ucko

Reputation: 6109

keyword_dict is always an empty dict. You never add anything to it. You might want to consider using defaultdict.
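
A minimal sketch of what that could look like, assuming a standalone function that takes search_criteria as a parameter instead of reading it from self (that signature is an assumption, not the asker's code):

```python
from collections import defaultdict


def build_keywords_found_dict(search_criteria, results):
    # Missing groups and keywords are created automatically on first access.
    group_dict = defaultdict(lambda: defaultdict(list))
    for group, keywords in search_criteria.items():
        for keyword in keywords:
            for item_id, text in results:
                if keyword in text:
                    group_dict[group][keyword].append(item_id)
    # Convert back to plain dicts so the result prints like a normal dict.
    return {group: dict(found) for group, found in group_dict.items()}


results = [(1, "This is an example"), (2, "Another example"), (3, "Random String")]
search_criteria = {
    "HR": ["example", "harrassment", "fired"],
    "Maintenance": ["is", "Random", "Cleaning"],
}
output = build_keywords_found_dict(search_criteria, results)
```

With this approach a group only appears in the result if at least one of its keywords was found.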

Upvotes: 0
