Reputation: 13
I'm trying to understand dicts in what I believe is a fairly simple way, but I'm thoroughly confused. When I run the code below, it populates the dict and counts the occurrences of each word as expected. The output is: {'gerry': 2, 'babona': 1, 'cheese': 1, 'cherry': 1}
dict = {}
a = ['gerry', 'babona', 'cheese', 'gerry', 'cherry']
b = ['O', 'O', 'T', 'T', 'T']
for (i, j) in zip(a, b):
    if i not in dict:
        dict[i] = 1
    else:
        dict[i] += 1
However, if I try to run the following code, I get KeyError: 'gerry' on the very first value in the list, and I cannot make sense of why. Any help is greatly appreciated!
dict = {}
a = ['gerry', 'babona', 'cheese', 'gerry', 'cherry']
b = ['O', 'O', 'T', 'T', 'T']
for (i, j) in zip(a, b):
    if i not in dict:
        dict[i][j] = 1
    else:
        dict[i][j] += 1
Upvotes: 0
Views: 74
Reputation: 5347
from collections import Counter
a = ['gerry', 'babona', 'cheese', 'gerry', 'cherry']
b = ['O' ,'O', 'T', 'T', 'T']
print(Counter(a))
Counter({'gerry': 2, 'babona': 1, 'cheese': 1, 'cherry': 1})
For the second case you can use groupby:
from itertools import groupby
dct = dict((key, tuple(v for (k, v) in pairs))
           for (key, pairs) in groupby(sorted(zip(a, b)), lambda pair: pair[0]))
{k: Counter(v) for k, v in dct.items()}
{'babona': Counter({'O': 1}), 'cheese': Counter({'T': 1}), 'cherry': Counter({'T': 1}), 'gerry': Counter({'O': 1, 'T': 1})}
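The sorted() call in the snippet above matters, because groupby only merges items that sit next to each other. A minimal sketch of the difference, using the question's list a (the names unsorted_keys and sorted_keys are just for illustration):

```python
from itertools import groupby

a = ['gerry', 'babona', 'cheese', 'gerry', 'cherry']

# groupby starts a new group every time the key changes, so the two
# 'gerry' entries end up in separate groups unless they are adjacent.
unsorted_keys = [key for key, _ in groupby(a)]
sorted_keys = [key for key, _ in groupby(sorted(a))]

print(unsorted_keys)  # ['gerry', 'babona', 'cheese', 'gerry', 'cherry']
print(sorted_keys)    # ['babona', 'cheese', 'cherry', 'gerry']
```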
Upvotes: 0
Reputation: 243
The first example can be simpler:
from collections import Counter
a = ['gerry', 'babona', 'cheese', 'gerry', 'cherry']
data = Counter(a)
For the second:
from collections import defaultdict
a = ['gerry', 'babona', 'cheese', 'gerry', 'cherry']
b = ['O', 'O', 'T', 'T', 'T']
data = defaultdict(lambda: defaultdict(int))
for (i, j) in zip(a, b):
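    data[i][j] += 1
By the way, don't name a variable after a built-in such as dict. It is not a reserved word, so Python allows it, but the name then shadows the built-in type.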
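To illustrate what goes wrong with that name, here is a small sketch. Once dict is rebound to an instance, calling dict(...) no longer reaches the built-in constructor. The names shadow_error and restored are only for the demonstration:

```python
import builtins

dict = {}  # rebinds the name, as in the question

shadow_error = None
try:
    dict(a=1)  # tries to call the empty dict instance, not the built-in type
except TypeError as exc:
    shadow_error = str(exc)

print(shadow_error)

# The original type is still reachable through the builtins module.
restored = builtins.dict(a=1)
print(restored)  # {'a': 1}

dict = builtins.dict  # undo the shadowing
```

Any later code that calls dict(...), such as the groupby answer in this thread, would fail in the same way while the name is shadowed.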
Upvotes: 3
Reputation: 39414
When the program gets to: dict[i][j] = 1
it has to evaluate: dict[i]
first, and that lookup raises the KeyError when i is not yet a key. It is exactly the lookup you were avoiding with the membership test in the first snippet.
You will have to test at both levels (first for i, then for j) to get this to work.
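To see that lookup fail in isolation, here is a minimal sketch. The name counts is a stand-in for the question's dict, and missing_key is only there to show which key was reported:

```python
counts = {}  # stand-in for the question's `dict`

missing_key = None
try:
    # Python reads counts['gerry'] first, then assigns into the result,
    # so the read fails before any assignment happens.
    counts['gerry']['O'] = 1
except KeyError as exc:
    missing_key = exc.args[0]

print(missing_key)  # gerry

counts['gerry'] = {}      # create the inner dict first
counts['gerry']['O'] = 1  # now the read of counts['gerry'] succeeds
print(counts)  # {'gerry': {'O': 1}}
```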
You can get it to work like this, but there are simpler ways:
dct = {}
a = ['gerry', 'babona', 'cheese', 'gerry', 'cherry']
b = ['O', 'O', 'T', 'T', 'T']
for (i, j) in zip(a, b):
    if i not in dct:
        dct[i] = {}
        dct[i][j] = 1
    else:
        di = dct[i]
        if j not in di:
            di[j] = 0
        dct[i][j] += 1
print(dct)
Output:
{'gerry': {'O': 1, 'T': 1}, 'babona': {'O': 1}, 'cheese': {'T': 1}, 'cherry': {'T': 1}}
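As one possible simpler way with plain dicts, the two membership tests can be folded into dict.setdefault and dict.get. This is only a sketch of an alternative, not the only option; inner is a name chosen for illustration:

```python
a = ['gerry', 'babona', 'cheese', 'gerry', 'cherry']
b = ['O', 'O', 'T', 'T', 'T']

dct = {}
for i, j in zip(a, b):
    # setdefault returns the existing inner dict, or inserts {} and returns it.
    inner = dct.setdefault(i, {})
    # get supplies 0 when j has not been counted yet for this word.
    inner[j] = inner.get(j, 0) + 1

print(dct)
# {'gerry': {'O': 1, 'T': 1}, 'babona': {'O': 1}, 'cheese': {'T': 1}, 'cherry': {'T': 1}}
```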
Upvotes: 2