Reputation: 26315
I have a defaultdict that looks like this:
my_dict = defaultdict(dict, {"K": {"k": 2, "x": 1.0}, "S": {"_":1.0, "s":1}, "EH": {"e":1.0}})
The keys are phonemes. Each value is itself a dictionary that maps graphemes to the number of times they occur for that phoneme.
The function should return another defaultdict containing the probabilities, which will look like this:
defaultdict(<class 'dict'>, {'EH': {'e': 1.0}, 'K': {'k': 0.6666666666666666, 'x': 0.3333333333333333}, 'S': {'_': 0.5, 's': 0.5}})
'e' remains the same, as 1.0/1 = 1.0. 'K' has values of 0.66666 and 0.33333 because 2/3 = 0.66666 and 1/3 = 0.33333. 'S' has values of 0.5 and 0.5, because 1/2 = 0.5 for each of them. The probabilities within each inner dict must always sum to one.
So far I have this:
from collections import defaultdict
my_dict = defaultdict(dict, {"K": {"k": 2, "x": 1.0}, "S": {"_":1.0, "s":1}, "EH": {"e":1.0}})
def dict_probability(my_dict):
    return_dict = defaultdict(dict)
    for key, value in my_dict.items():
        for k, v in value.items():
            pass  # not sure how to compute the probabilities here
I would also like it to work for a defaultdict that looks like this:
dict_two = defaultdict(dict, {('EH', 't'): {'e': 2}, ('N', 'e'): {'ne': 1, 'n': 2}})
Its keys are tuples of strings rather than single strings; I would like those keys to be returned unchanged.
I'm just not sure how to do this properly. Any help would be appreciated.
I would also like the function to update the dict in place, so that every call behaves like this:
>>> my_dict = defaultdict(dict, {"K": {"k": 2, "x": 1.0}, "S": {"_":1.0, "s":1}, "EH": {"e":1.0}})
>>> dict_probability(my_dict)
>>> print(my_dict)
defaultdict(<class 'dict'>, {'EH': {'e': 1.0}, 'K': {'k': 0.6666666666666666, 'x': 0.3333333333333333}, 'S': {'_': 0.5, 's': 0.5}})
I would also like the dict_probability function to return None.
Upvotes: 1
Views: 97
Reputation: 54163
You'll basically want to sum each inner dict's values, then divide each subkey's individual value by that sum.
from collections import defaultdict

result = defaultdict(dict)
for bigkey, d in yourdict.items():
    # bigkey="K", d={"k": 2, "x": 1.0}, ...
    total = sum(d.values())
    # list(d.values()) == [2, 1.0], so total == 3.0
    for k, v in d.items():
        # k="k", v=2, ...
        result[bigkey][k] = v / total
        # result["K"]["k"] = 2 / 3.0
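The same loop can be wrapped in a function. Here is a minimal sketch, assuming the input always maps an outer key to a dict of numeric counts; the parameter name `counts` and the variable `probs_two` are mine, not from the question. Since only the inner values are divided, the outer keys are copied over untouched, so the tuple-keyed `dict_two` from the question should work without any special handling.

```python
from collections import defaultdict


def dict_probability(counts):
    """Return a new defaultdict whose inner values are normalized to sum to 1."""
    result = defaultdict(dict)
    for outer_key, inner in counts.items():
        total = sum(inner.values())
        for grapheme, count in inner.items():
            # Outer keys are copied as-is, so strings and tuples both work.
            result[outer_key][grapheme] = count / total
    return result


dict_two = defaultdict(dict, {('EH', 't'): {'e': 2}, ('N', 'e'): {'ne': 1, 'n': 2}})
probs_two = dict_probability(dict_two)
print(probs_two[('N', 'e')])  # {'ne': 0.3333333333333333, 'n': 0.6666666666666666}
```

Note that this sketch raises ZeroDivisionError if every count in an inner dict is 0; whether that case can occur depends on your data.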
This could be done all in one really ugly dict comp, if you have no regard for future programmers.
result = defaultdict(dict, {bigkey: {k: v / sum(d.values()) for k, v in d.items()} for bigkey, d in yourdict.items()})
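The question also asks for the original dict to be updated, with the function returning None. A rough sketch of that variant follows; the name `normalize_in_place` is hypothetical (the question calls its function `dict_probability`), and it assumes the inner dicts hold only numeric counts.

```python
from collections import defaultdict


def normalize_in_place(counts):
    """Overwrite each inner count with its share of the inner total; returns None."""
    for inner in counts.values():
        total = sum(inner.values())  # computed before any value is overwritten
        for grapheme in inner:
            # Reassigning an existing key does not change the dict's size,
            # so it is safe to do while iterating over the keys.
            inner[grapheme] /= total


my_dict = defaultdict(dict, {"K": {"k": 2, "x": 1.0}, "S": {"_": 1.0, "s": 1}, "EH": {"e": 1.0}})
outcome = normalize_in_place(my_dict)
print(outcome)       # None
print(my_dict["S"])  # {'_': 0.5, 's': 0.5}
```

Because the counts are overwritten, calling it a second time on the same dict just divides probabilities that already sum to one, so the values stay essentially the same.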
Upvotes: 1