Reputation: 157
I have seen examples of how to count items in a dictionary or a list. My dictionary stores multiple lists, and each list stores multiple items.
d = {'text1': ['A', 'C', 'E', 'F'],
     'text2': ['A'],
     'text3': ['C', 'D'],
     'text4': ['A', 'B'],
     'text5': ['A']}
1. I want to count the frequency of each letter, i.e. the results should be
A - 4
B - 1
C - 2
D - 1
E - 1
F - 1
2. I want to group the keys by letter, i.e. the results should be
A - text1, text2, text4, text5
B - text4
C - text1, text3
D - text3
E - text1
F - text1
How can I achieve both using existing Python libraries, without writing many for loops?
Upvotes: 1
Views: 5307
Reputation: 1724
There are a few ways to accomplish this, but if you'd like to handle things without importing additional modules or installing external ones, this method will work cleanly 'out of the box.'
With d as your starting dictionary:
d = {'text1': ['A', 'C', 'E', 'F'],
     'text2': ['A'],
     'text3': ['C', 'D'],
     'text4': ['A', 'B'],
     'text5': ['A']}
Create a new dict, called letters, for your results to live in, and populate it with the letters found in the lists stored in d. If a letter isn't present yet, create its key with a two-item list as its value: the count, and a list holding the current key from d. If the letter is already there, increment the count and append the current key from d to the key list in its value.
letters = {}
for item in d.keys():
    for letter in d[item]:
        if letter not in letters.keys():
            letters[letter] = [1, [item]]
        else:
            letters[letter][0] += 1
            letters[letter][1] += [item]
This leaves you with a dict called letters containing the counts and the keys from d that contain each letter, like this:
{'E': [1, ['text1']], 'C': [2, ['text3', 'text1']], 'F': [1, ['text1']], 'A': [4, ['text2', 'text4', 'text1', 'text5']], 'B': [1, ['text4']], 'D': [1, ['text3']]}
Now, to print your first list, do:
for letter in sorted(letters):
    print(letter, letters[letter][0])
This prints each letter along with the first item of its list (the count), using the built-in sorted() function to put things in order.
To print the second list, likewise using sorted(), do the same but with the second item of the list (the keys), joined with ', ' into a single string using .join():
for letter in sorted(letters):
    print(letter, ', '.join(letters[letter][1]))
To ease Copy/Paste, here's the code unbroken by my ramblings:
d = {'text1': ['A', 'C', 'E', 'F'],
     'text2': ['A'],
     'text3': ['C', 'D'],
     'text4': ['A', 'B'],
     'text5': ['A']}
letters = {}
for item in d.keys():
    for letter in d[item]:
        if letter not in letters.keys():
            # first sighting: count of 1 and a list holding this key
            letters[letter] = [1, [item]]
        else:
            letters[letter][0] += 1
            letters[letter][1] += [item]
print(letters)
# first list: letter and count
for letter in sorted(letters):
    print(letter, letters[letter][0])
print()
# second list: letter and the keys that contain it
for letter in sorted(letters):
    print(letter, ', '.join(letters[letter][1]))
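If the [count, [keys]] value feels awkward, here is a rough sketch of a variant that still needs no imports. It stores only the list of keys per letter and derives the count with len(). The helper name group_letters is my own invention for this sketch, not something from the code above.

```python
# Sketch only: group_letters is a hypothetical helper name.
def group_letters(d):
    """Map each letter to the list of keys in d whose list contains it."""
    groups = {}
    for key, letters in d.items():
        for letter in letters:
            # setdefault returns the existing list, or stores and returns a new one
            groups.setdefault(letter, []).append(key)
    return groups


d = {'text1': ['A', 'C', 'E', 'F'],
     'text2': ['A'],
     'text3': ['C', 'D'],
     'text4': ['A', 'B'],
     'text5': ['A']}

groups = group_letters(d)
for letter in sorted(groups):
    # the count is simply the length of the group
    print(letter, len(groups[letter]), ', '.join(groups[letter]))
```

Keeping a single list per letter means the count can never drift out of sync with the keys.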
Hope this helps!
Upvotes: 2
Reputation: 46921
This is a way to achieve this:
from collections import defaultdict
# every missing key starts out as an empty list
alphabets = defaultdict(list)
for text, letters in d.items():
    for letter in letters:
        alphabets[letter].append(text)
# task 2: the texts grouped by letter
for letter, texts in sorted(alphabets.items()):
    print(letter, texts)
# task 1: the number of texts per letter
for letter, texts in sorted(alphabets.items()):
    print(letter, len(texts))
Note that once you have A - text1, text2, text4, text5, getting to A - 4 is just a matter of counting the texts.
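To get output in exactly the "A - 4" shape from the question, something along these lines should work. Treat it as a sketch: the names count_lines and group_lines are made up for this example, and the texts are sorted explicitly so the result does not depend on dictionary ordering.

```python
from collections import defaultdict

d = {'text1': ['A', 'C', 'E', 'F'],
     'text2': ['A'],
     'text3': ['C', 'D'],
     'text4': ['A', 'B'],
     'text5': ['A']}

alphabets = defaultdict(list)
for text, letters in d.items():
    for letter in letters:
        alphabets[letter].append(text)

# count_lines and group_lines are hypothetical names used only in this sketch
count_lines = [f"{letter} - {len(texts)}"
               for letter, texts in sorted(alphabets.items())]
group_lines = [f"{letter} - {', '.join(sorted(texts))}"
               for letter, texts in sorted(alphabets.items())]

print('\n'.join(count_lines))
print('\n'.join(group_lines))
```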
Upvotes: 0
Reputation: 5968
For your first task:
from collections import Counter
d = {
    'text1': ['A', 'C', 'E', 'F'],
    'text2': ['A'],
    'text3': ['C', 'D'],
    'text4': ['A', 'B'],
    'text5': ['A']
}
occurrences = Counter(''.join(''.join(values) for values in d.values()))
print(sorted(occurrences.items(), key=lambda l: l[0]))
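Now let me explain it: the inner join concatenates each list into a string, the outer join concatenates those strings into one, and Counter then counts every character of the result.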
As far as I can see, you already have a solution for your second problem.
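One caveat worth checking before relying on the join approach: it counts characters, so it only matches the question's data because every item there is a single letter. The sketch below uses made-up data with a two-character item (not from the question) and compares it with itertools.chain.from_iterable, which keeps each list item whole.

```python
from collections import Counter
from itertools import chain

# Hypothetical data with a multi-character item, not from the question
d = {'text1': ['AB', 'C'],
     'text2': ['AB']}

# Joining everything into one string counts individual characters
by_char = Counter(''.join(''.join(values) for values in d.values()))

# chain.from_iterable flattens the lists but keeps each item intact
by_item = Counter(chain.from_iterable(d.values()))

print(sorted(by_char.items()))
print(sorted(by_item.items()))
```

With single-letter items both approaches give the same counts, so either is fine for the data as posted.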
Upvotes: 0
Reputation: 12178
from collections import Counter, defaultdict
from itertools import chain
d = {'text1': ['A', 'C', 'E', 'F'],
     'text2': ['A'],
     'text3': ['C', 'D'],
     'text4': ['A', 'B'],
     'text5': ['A']}
counter = Counter(chain.from_iterable(d.values()))
group = defaultdict(list)
for k, v in d.items():
    for i in v:
        group[i].append(k)
out:
Counter({'A': 4, 'B': 1, 'C': 2, 'D': 1, 'E': 1, 'F': 1})
defaultdict(list,
{'A': ['text2', 'text4', 'text1', 'text5'],
'B': ['text4'],
'C': ['text1', 'text3'],
'D': ['text3'],
'E': ['text1'],
'F': ['text1']})
Upvotes: 0
Reputation: 44484
To get to (2), first flatten the dictionary into a list of (letter, text) pairs, which effectively inverts its keys and values. Once you are there, sort the list and use groupby with a key function to get the structure of (2).
from itertools import groupby
arr = [(x,t) for t, a in d.items() for x in a]
# [('A', 'text2'), ('C', 'text3'), ('D', 'text3'), ('A', 'text1'), ('C', 'text1'), ('E', 'text1'), ('F', 'text1'), ('A', 'text4'), ('B', 'text4'), ('A', 'text5')]
res = {g: [x[1] for x in items] for g, items in groupby(sorted(arr), key=lambda x: x[0])}
#{'A': ['text1', 'text2', 'text4', 'text5'], 'C': ['text1', 'text3'], 'B': ['text4'], 'E': ['text1'], 'D': ['text3'], 'F': ['text1']}
res2 = {x: len(y) for x, y in res.items()}
#{'A': 4, 'C': 2, 'B': 1, 'E': 1, 'D': 1, 'F': 1}
PS: I hope you'd use meaningful variable names in your real code.
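In case it is not obvious why the sorted() call matters: groupby only groups consecutive items that share a key, so unsorted input yields the same key more than once. A small sketch with made-up pairs (the names pairs and first are just for illustration):

```python
from itertools import groupby

# Hypothetical pairs; the two 'A' entries are deliberately not adjacent
pairs = [('A', 'text1'), ('C', 'text1'), ('A', 'text2')]


def first(pair):
    """Key function: group by the letter."""
    return pair[0]


# Without sorting, 'A' starts a new group each time it reappears
unsorted_keys = [key for key, _ in groupby(pairs, key=first)]

# Sorting first makes equal keys adjacent, so each letter appears once
sorted_keys = [key for key, _ in groupby(sorted(pairs), key=first)]

print(unsorted_keys)
print(sorted_keys)
```

Building a dict from the unsorted groups would silently keep only the last group for each repeated key, which is an easy bug to miss.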
Upvotes: 4