RoadRunner

Reputation: 26315

Accessing dictionary within dictionary

I have a default dict which looks like this:

my_dict = defaultdict(dict, {"K": {"k": 2, "x": 1.0}, "S": {"_": 1.0, "s": 1}, "EH": {"e": 1.0}})

The keys are phonemes. Each value is itself a dictionary that maps graphemes to the number of times each grapheme occurs for that phoneme.

The function should return another default dict containing the probabilities, which will look like this:

defaultdict(<class 'dict'>, {'EH': {'e': 1.0}, 'K': {'k': 0.6666666666666666, 'x': 0.3333333333333333}, 'S': {'_': 0.5, 's': 0.5}})

'e' remains the same, as 1.0/1 = 1.0. 'K' has values of 0.66666 and 0.33333 because 2/3 = 0.66666 and 1/3 = 0.33333. 'S' has values of 0.5 and 0.5, because 1/2 = 0.5 for each of them. The probabilities within each inner dict must always sum to one.

So far I have this:

from collections import defaultdict

my_dict = defaultdict(dict, {"K": {"k": 2, "x": 1.0}, "S": {"_": 1.0, "s": 1}, "EH": {"e": 1.0}})

def dict_probability(my_dict):

   return_dict = defaultdict(dict)

   for key, value in my_dict.items():
       for k, v in value.items():

I would also like to make it work for a defaultdict that looks like this:

    dict_two = defaultdict(dict, {('EH', 't'): {'e': 2}, ('N', 'e'): {'ne': 1, 'n': 2}})

Here the keys are tuples of characters rather than single strings. I would like those keys to be returned unchanged.

I'm just not sure how I should do this properly. Any help would be appreciated.

I would also like this to work every time I call the function:

    >>> my_dict = defaultdict(dict, {"K": {"k": 2, "x": 1.0}, "S": {"_": 1.0, "s": 1}, "EH": {"e": 1.0}})
    >>> dict_probability(my_dict)
    >>> print(my_dict)
    defaultdict(<class 'dict'>, {'EH': {'e': 1.0}, 'K': {'k': 0.6666666666666666, 'x': 0.3333333333333333}, 'S': {'_': 0.5, 's': 0.5}})

I would also like the dict_probability function to return None, so that it updates my_dict in place.

Upvotes: 1

Views: 97

Answers (1)

Adam Smith

Reputation: 54163

You'll basically want to sum the values of each inner dict, then divide each subkey's individual value by that sum.

from collections import defaultdict

result = defaultdict(dict)

for bigkey, d in yourdict.items():
    # bigkey="K", d={"k": 2, "x": 1.0}, ...
    total = sum(d.values())
    # d.values() contains 2 and 1.0, so total == 3.0
    for k, v in d.items():
        # k="k", v=2, ...
        result[bigkey][k] = v / total
        # result["K"]["k"] = 2 / 3
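Wrapped in a function and run against both dicts from the question, the approach might look like the sketch below. The parameter name `counts` and the variables `probs` and `probs_two` are made up for illustration. The tuple keys of `dict_two` need no special handling, since the outer keys are only copied across and never inspected.

```python
from collections import defaultdict


def dict_probability(counts):
    """Return a new defaultdict whose inner values are normalized to sum to 1."""
    result = defaultdict(dict)
    for outer_key, inner in counts.items():
        # Total count across all graphemes for this outer key.
        total = sum(inner.values())
        for inner_key, count in inner.items():
            result[outer_key][inner_key] = count / total
    return result


my_dict = defaultdict(dict, {"K": {"k": 2, "x": 1.0}, "S": {"_": 1.0, "s": 1}, "EH": {"e": 1.0}})
dict_two = defaultdict(dict, {("EH", "t"): {"e": 2}, ("N", "e"): {"ne": 1, "n": 2}})

probs = dict_probability(my_dict)
probs_two = dict_probability(dict_two)
```

Note that an inner dict whose counts sum to zero would raise ZeroDivisionError here; the question's data never hits that case, so the sketch does not guard against it.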

This could be done all in one really ugly dict comp, if you have no regard for future programmers.

result = defaultdict(dict, {bigkey: {k: v / sum(d.values()) for k, v in d.items()} for bigkey, d in yourdict.items()})
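The question also asks for a version that changes the dict it is given and returns None. A minimal sketch of that, assuming the inner dicts can safely be overwritten (the name `normalize_in_place` is made up for illustration):

```python
from collections import defaultdict


def normalize_in_place(counts):
    """Overwrite each inner count with its share of the inner total. Returns None."""
    for inner in counts.values():
        # Compute the total first, before any value is overwritten.
        total = sum(inner.values())
        for inner_key in inner:
            # Reassigning existing keys does not change the dict's size,
            # so this is safe while iterating over it.
            inner[inner_key] /= total


my_dict = defaultdict(dict, {"K": {"k": 2, "x": 1.0}, "S": {"_": 1.0, "s": 1}, "EH": {"e": 1.0}})
returned = normalize_in_place(my_dict)
```

After the call, `my_dict` itself holds the probabilities and `returned` is None. Calling the function a second time on the same dict leaves the values essentially unchanged, because each inner dict already sums to one.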

Upvotes: 1
