Keebles

Reputation: 13

Initialising and incrementing a nested dict in Python

I am sorry if this has been asked elsewhere, but I have been trying and googling all day to solve this, to no avail.

I wish to initialise an empty dict as such:

empty_dict = {}

Then I take rows from a CSV file and save them into a variable, let's say saved_word_list. Each row in saved_word_list contains a sentence, and each sentence is labelled either A or B. I would like to populate empty_dict with every unique word in each sentence, so that a word is counted only once per row and the count is added under the correct label in the nested dict.

An example:

row_1 = "this is a fine day to do a lot of coding"

This row would be labelled A, so the dict would become:

empty_dict = {'this':{'A':1,'B':0}, 'is':{'A':1, 'B':0}, 'a':{'A':1, 'B':0}.....}

Below is as far as I have got, but I don't understand how to reach the result shown in the line above. Any ideas on how I can get to that point?

for (sentence, label) in zip(saved_word_list, labels):
    keys = set(labels)
    values = set(sentence.split())
    for key in keys:
      for value in values:
        if value not in empty_Dict:
          empty_Dict[value][key] = value
        else:
          empty_Dict[value][key] += 1

Upvotes: 0

Views: 326

Answers (1)

Farhaan S.

Reputation: 104

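This should do what you want: the first time a word is seen it gets an inner dict with every label set to 0, and then the count for the current row's label is incremented once per row.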

def template_inner_dict():
    """
    Create a template dictionary with the row labels as keys
    and 0 as values
    """
    return {i: 0 for i in labels}


empty_dict = {}
for (sentence, label) in zip(saved_word_list, labels):
    tokens = set(sentence.split())
    for token in tokens:
        # insert a template dict for a token not yet in the dict
        if token not in empty_dict:
            empty_dict[token] = template_inner_dict()
        # increment the label count associated with token by 1
        empty_dict[token][label] += 1
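
If it helps to see the whole thing run end to end, here is a minimal self-contained sketch of the same idea. The two sample sentences and their labels are made up for illustration (your real saved_word_list and labels would come from the CSV), and I have swapped the explicit "if token not in ..." check for collections.defaultdict, which is just one possible way to write it. The name word_counts is also mine.

```python
from collections import defaultdict

# Hypothetical sample data standing in for the CSV rows and their labels.
saved_word_list = [
    "this is a fine day to do a lot of coding",
    "this is a bad day",
]
labels = ["A", "B"]

# The distinct labels, used to build the zero-filled inner dict.
label_set = sorted(set(labels))

# Any word seen for the first time starts with a count of 0 for every label.
word_counts = defaultdict(lambda: {lab: 0 for lab in label_set})

for sentence, label in zip(saved_word_list, labels):
    # set() makes sure each word is counted at most once per row.
    for token in set(sentence.split()):
        word_counts[token][label] += 1

print(word_counts["this"])  # {'A': 1, 'B': 1}
print(word_counts["fine"])  # {'A': 1, 'B': 0}
print(word_counts["a"])     # {'A': 1, 'B': 1}
```

Note that "a" occurs twice in the first sentence but still only adds 1 to its A count, because the sentence is turned into a set before counting.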

Upvotes: 1
