MFS

Reputation: 63

Using "+=" on a list of Counters results in unexpected behavior in Python

I am using Python 3.4.1 and I was wondering about the following situation:

Given an array of counters

cnt = [Counter()] * n

I want to add some items in a specific position, so I do

cnt[i] += Counter(x)

I expected "+=" to behave like

cnt[i] = cnt[i] + Counter(x)

But, instead of what I expected, I received something equivalent to

for i in range(0, n):
    cnt[i] = cnt[i] + Counter(x)

In other words, the items were added to every counter in the list.

A short example:

from collections import Counter

text = "Lorem ipsum dolor sit amet, consectetur adipiscing elit."
cnt = [Counter()] * 2

i = 0
for c in text:
    cnt[i] += Counter(c)  # cnt[i] = cnt[i] + Counter(c)
    i = (i+1) % 2

for i in range(0, 2):
    print(cnt[i], i)

Output:

Counter({' ': 7, 'i': 6, 'e': 5, 't': 5, 'o': 4, 's': 4, 'm': 3, 'r': 3, 'c': 3, 'u': 2, 'p': 2, 'a': 2, 'l': 2, 'd': 2, 'n': 2, '.': 1, 'g': 1, 'L': 1, ',': 1}) 0
Counter({' ': 7, 'i': 6, 'e': 5, 't': 5, 'o': 4, 's': 4, 'm': 3, 'r': 3, 'c': 3, 'u': 2, 'p': 2, 'a': 2, 'l': 2, 'd': 2, 'n': 2, '.': 1, 'g': 1, 'L': 1, ',': 1}) 1

Expected output:

Counter({'t': 4, 'i': 3, 'r': 3, 's': 2, 'e': 2, 'm': 2, 'c': 2, 'n': 2, 'a': 2, 'l': 2, ',': 1, 'd': 1, ' ': 1, 'L': 1}) 0
Counter({' ': 6, 'o': 4, 'i': 3, 'e': 3, 's': 2, 'u': 2, 'p': 2, '.': 1, 't': 1, 'g': 1, 'd': 1, 'm': 1, 'c': 1}) 1

Upvotes: 1

Views: 725

Answers (1)

Michael0x2a

Reputation: 63998

When you do cnt = [Counter()] * n, what you're doing is creating a single counter, then making every element in your list point to that counter. You're not creating n Counters, you're creating a single one.
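One way to see this for yourself is to compare object identities. This is only a minimal sketch; the value of n and the name shared are chosen for illustration and are not from the question.

```python
from collections import Counter

n = 3  # illustrative size, not from the question
shared = [Counter()] * n

# Every slot refers to the very same Counter object.
print(all(c is shared[0] for c in shared))  # True
print(len({id(c) for c in shared}))         # 1
```

If the list really held n separate counters, the second line would print n instead of 1.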

This is because in Python, names and list elements hold references to objects rather than the objects themselves (roughly speaking). You've essentially duplicated the reference to the counter object n times, not the counter itself.

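That means that doing cnt[i] += Counter(x) will modify the underlying counter, making it appear like the entire list changed. This happens because Counter implements "+=" in place: it updates the existing object instead of building a new one, so cnt[i] += Counter(x) is not quite the same as cnt[i] = cnt[i] + Counter(x). The latter creates a new Counter and rebinds only cnt[i].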
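The difference between the in-place "+=" and a plain "+" followed by assignment can be seen in a small experiment. This is a sketch with made-up list names (a and b), not the code from the question.

```python
from collections import Counter

a = [Counter()] * 2
a[0] += Counter("x")          # in-place: mutates the one shared Counter
print(a[0]["x"], a[1]["x"])   # 1 1

b = [Counter()] * 2
b[0] = b[0] + Counter("x")    # builds a new Counter and rebinds only slot 0
print(b[0]["x"], b[1]["x"])   # 1 0
```

So spelling out the addition would hide the symptom, but the list would still start out with shared references, which is usually not what you want.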

To fix this, try doing something like the following:

cnt = [Counter() for i in range(n)]

Now, you're genuinely creating n different counters (because you call the constructor n times) and will get the expected behavior.
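Applied to the example from the question, the fix might look like the sketch below. The use of enumerate and i % 2 is just one way to alternate between the two counters; it is a stylistic choice, not something the original code requires.

```python
from collections import Counter

text = "Lorem ipsum dolor sit amet, consectetur adipiscing elit."
cnt = [Counter() for _ in range(2)]  # two independent Counters

for i, c in enumerate(text):
    cnt[i % 2] += Counter(c)  # only the selected Counter changes

for i, counter in enumerate(cnt):
    print(counter, i)
```

Each counter now only sees every other character, so the two printed counters differ.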

Upvotes: 1
