Alexis

Reputation: 2304

Probabilities on a dictionary of lists in python

I have this dictionary of lists:

my_dict = {'Summer':['Summer','Summer','gone'],'gone':['forever'],'forever':['gone']}

I want the probability of each word within its list, as a dictionary. In this case the expected dictionary is:

my_dict_probs = {'Summer':{'Summer':0.66,'gone':0.33}, 'gone':{'forever':1}, 'forever':{'gone':1}}

So I have tried this:

prob_dict = {}
for k,v in my_dict.items():
  prob_dict[k] = v/len(v)
prob_dict

And I get this error: TypeError: unsupported operand type(s) for /: 'list' and 'int'. I guess I need to count each unique value instead, so my approach does not work. Could you help me?

Upvotes: 2

Views: 286

Answers (3)

Patrick Artner

Reputation: 51653

Smallest change to your existing code:

my_dict = {'Summer':['Summer','Summer','gone'],'gone':['forever'],'forever':['gone']} 
    
prob_dict = {}
for k,v in my_dict.items():
    prob_dict[k] = {}                 # create inner dict
    for i in set(v):                  # for each unique element, count its occurrences
        prob_dict[k][i] = v.count(i) / len(v)

print(prob_dict)

Output:

{'Summer': {'Summer': 0.6666666666666666, 'gone': 0.3333333333333333}, 
 'gone': {'forever': 1.0}, 
 'forever': {'gone': 1.0}}

This is less efficient than using Counter because it iterates over the inner list once for each unique value. Counter does the same in one pass, no matter how long the inner list is.

But it needs no imports and changes your existing code the least.


To get closer to what Counter does, you could do:

prob_dict = {}
for k,v in my_dict.items():
    prob_dict[k] = {}
    partial = 1.0 / len(v)
    for i in v:
        prob_dict[k].setdefault(i,0)
        prob_dict[k][i] += partial

print(prob_dict)

This iterates over each inner list only once, but repeatedly adding a float can accumulate floating-point inaccuracies.
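
One way to keep the single pass without adding floats repeatedly is to tally integer counts first and divide once per unique word. The following is a minimal sketch under that assumption; `count_then_divide` is a hypothetical helper name, not something from the question or from any library.

```python
my_dict = {'Summer': ['Summer', 'Summer', 'gone'], 'gone': ['forever'], 'forever': ['gone']}

def count_then_divide(words):
    """Return {word: probability} for one list (hypothetical helper)."""
    counts = {}
    for word in words:
        # integer tally: exact, and the list is walked only once
        counts[word] = counts.get(word, 0) + 1
    total = len(words)
    # one division per unique word; an empty list simply yields {}
    return {word: n / total for word, n in counts.items()}

prob_dict = {k: count_then_divide(v) for k, v in my_dict.items()}
print(prob_dict)
```

Because each probability comes from a single division, a list of ten identical words gives exactly 1.0, whereas adding 0.1 ten times gives 0.9999999999999999.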

Upvotes: 3

user15801675

Reputation:

Try this

my_dict = {'Summer':['Summer','Summer','gone'],'gone':['forever'],'forever':['gone']}
for k, v in my_dict.items():
    # replace each list with a {word: probability} dict, rounded to 2 decimals
    my_dict[k] = {j: round(v.count(j) / len(v), 2) for j in v}
print(my_dict)

Upvotes: 3

Corralien

Reputation: 120429

Use Counter from collections:

from collections import Counter

prob_dict = {}
for k, v in my_dict.items():
    # Counter tallies every word in one pass; divide each count by the list length
    prob_dict[k] = {k1: v1 / len(v) for k1, v1 in Counter(v).items()}
>>> prob_dict
{'Summer': {'Summer': 0.6666666666666666, 'gone': 0.3333333333333333},
 'gone': {'forever': 1.0},
 'forever': {'gone': 1.0}}
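
If two-decimal values like those in the question are wanted, the Counter approach could be wrapped in a small helper with optional rounding. This is only a sketch: `word_probabilities` and its `ndigits` parameter are hypothetical names introduced for illustration.

```python
from collections import Counter

def word_probabilities(mapping, ndigits=None):
    """Map each key to {word: probability}; round if ndigits is given (hypothetical helper)."""
    result = {}
    for key, words in mapping.items():
        total = len(words)
        # Counter counts all words in one pass over the list
        probs = {word: n / total for word, n in Counter(words).items()}
        if ndigits is not None:
            probs = {word: round(p, ndigits) for word, p in probs.items()}
        result[key] = probs
    return result

my_dict = {'Summer': ['Summer', 'Summer', 'gone'], 'gone': ['forever'], 'forever': ['gone']}
rounded = word_probabilities(my_dict, ndigits=2)
print(rounded)
```

Note that rounding 2/3 to two decimals gives 0.67, not the 0.66 written in the question.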

Upvotes: 2
