Reputation: 853
I'm being warned that this question has been frequently downvoted, but I haven't seen a solution for my particular problem.
I have a dictionary that looks like this:
d = {'a': [['I', 'said', 'that'], ['said', 'I']],
     'b': [['she', 'is'], ['he', 'was']]}
I would like the output to be a dictionary with the original keys, where each value is a dictionary giving the count of each word (e.g., {'a': {'I': 2, 'said': 2, 'that': 1}}, and likewise for 'b').
If the values were flat lists instead of lists of sublists, I could get what I wanted just by using Counter:
d2 = {'a': ['I','said','that', 'I'],'b': ['she','was','here']}
from collections import Counter
counts = {k: Counter(v) for k, v in d2.items()}
However, when I try this on d, I get TypeError: unhashable type: 'list', because Counter has to hash each item it counts, and here the items are the sublists themselves, which aren't hashable.
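For reference, a minimal reproduction of that error might look like the sketch below. The try/except wrapper and the error_message name are only there for illustration; they are not part of my real code.

```python
from collections import Counter

d = {'a': [['I', 'said', 'that'], ['said', 'I']],
     'b': [['she', 'is'], ['he', 'was']]}

try:
    # Counter(v) tries to use each element of v as a key,
    # and here each element is a list, which cannot be hashed.
    counts = {k: Counter(v) for k, v in d.items()}
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```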
I also know that if I just had sublists, I could get what I want with something like:
lst = [['I', 'said', 'that'], ['said', 'I']]
Counter(word for sublist in lst for word in sublist)
But I just can't figure out how to combine these ideas to solve my problem (and I guess it lies in combining these two).
I did try this:
for key, values in d.items():
    flat_list = [item for sublist in values for item in sublist]
    new_dict = {key: flat_list}
counts = {k: Counter(v) for k, v in new_dict.items()}
But that only gives me the counts for the second key, because flat_list ends up holding only the values for the second key.
Upvotes: 0
Views: 137
Reputation: 595
Use both the itertools and collections modules for this. Flatten the nested lists with itertools.chain and count with collections.Counter:
import collections
import itertools

d = {
    'a': [['I', 'said', 'that'], ['said', 'I']],
    'b': [['she', 'is'], ['he', 'was']]
}
out_dict = {}
for d_key, data in d.items():
    # chain(*data) yields every word from every sublist in turn
    counter = collections.Counter(itertools.chain(*data))
    out_dict[d_key] = counter
print(out_dict)
Output:
{'a': Counter({'I': 2, 'said': 2, 'that': 1}),
'b': Counter({'she': 1, 'is': 1, 'he': 1, 'was': 1})}
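As a possible variation, itertools.chain.from_iterable takes the outer list directly instead of unpacking it with *, which reads a little more cleanly inside a dict comprehension. This is only a sketch of the same idea; the out_dict name is reused from the snippet above.

```python
import collections
import itertools

d = {
    'a': [['I', 'said', 'that'], ['said', 'I']],
    'b': [['she', 'is'], ['he', 'was']],
}

# chain.from_iterable(data) lazily walks each sublist of data in order,
# so Counter sees a flat stream of words rather than a list of lists.
out_dict = {
    key: collections.Counter(itertools.chain.from_iterable(data))
    for key, data in d.items()
}
print(out_dict)
```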
Upvotes: 0
Reputation: 678
You can merge your sublists to get your d2:
d2 = {k: reduce(list.__add__, d[k], []) for k in d}
In Python 3, you will first need from functools import reduce.
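Put together with the Counter step from the question, a complete sketch might look like this. Note that list.__add__ builds a new list at every step, so this is probably best kept to small inputs.

```python
from collections import Counter
from functools import reduce

d = {'a': [['I', 'said', 'that'], ['said', 'I']],
     'b': [['she', 'is'], ['he', 'was']]}

# Concatenate the sublists of each value into one flat list,
# starting from an empty list so empty values also work.
d2 = {k: reduce(list.__add__, d[k], []) for k in d}

# d2 now has the flat shape that Counter can handle directly.
counts = {k: Counter(v) for k, v in d2.items()}
print(counts)
```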
Upvotes: 0
Reputation: 36043
To combine the two solutions, just replace Counter(v) from your first solution with the generator-based Counter call from your second solution.
from collections import Counter
d = {'a': [['I', 'said', 'that'], ['said', 'I']],
     'b': [['she', 'is'], ['he', 'was']]}
counts = {k: Counter(word
                     for sublist in lst
                     for word in sublist)
          for k, lst in d.items()}
print(counts)
Output:
{'a': Counter({'I': 2, 'said': 2, 'that': 1}),
'b': Counter({'she': 1, 'is': 1, 'he': 1, 'was': 1})}
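Counter is a dict subclass, so the result above can usually be used as-is. If plain dictionaries are wanted, as in the question's example output, one option is to wrap each Counter in dict(). This is a sketch; the plain_counts name is just illustrative.

```python
from collections import Counter

d = {'a': [['I', 'said', 'that'], ['said', 'I']],
     'b': [['she', 'is'], ['he', 'was']]}

# Same nested generator as before, with each Counter converted
# to an ordinary dict so the result contains no Counter objects.
plain_counts = {
    k: dict(Counter(word for sublist in lst for word in sublist))
    for k, lst in d.items()
}
print(plain_counts)
```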
Upvotes: 2