YoussefMabrouk

Reputation: 209

Build histogram from objects of type counter

I used Counter from collections to generate a large list of Counters, e.g. List = [Counter({'A': 4}), Counter({'A': 2}), Counter({'A': 4, 'B': 3}), ...].

I would like to build a histogram from that list, where each bin corresponds to one specific counter, i.e. all counters that hold the same count for each element fall into the same bin.

Here is an example

from collections import Counter
data = [['A', 'B', 'B'], ['A', 'B', 'B'], ['A', 'B', 'B'], ['C']]
data_counters = []
for d in data:
    data_counters.append(Counter(d))

I could generate a list of all possible counter outcomes and then count how many times each counter occurs in data_counters. But this is difficult since the number of all possible outcomes is large.

Strictly speaking, this problem is just the calculation of a multi-dimensional histogram, where each dimension corresponds to a letter. However, I want to avoid that and use only the bins for the letter combinations that occur many times, without looking at the detailed information.

Upvotes: 1

Views: 197

Answers (1)

Mr. T

Reputation: 12410

Counter objects are not hashable, so they cannot be used as keys of a second Counter that would count them. One way to deal with this is to convert each Counter into a frozenset of its (element, count) pairs, which is hashable. The disadvantage is that we have to translate the frozensets back into something like strings that matplotlib can use as labels for the bar chart, i.e. the histogram:

from collections import Counter
from matplotlib import pyplot as plt

data_list = [['A', 'B', 'B'], ['B', 'A', 'B'], ['A', 'A', 'B'], ['C']]
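# Counter objects are unhashable, so key each one by a frozenset of its (element, count) pairs
data_counter = Counter(frozenset(Counter(d).items()) for d in data_list)

plt.xticks(rotation=45, ha="right")
# Sort each key so the bar label does not depend on set iteration order
plt.bar([str(sorted(x)) for x in data_counter.keys()], list(data_counter.values()))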
plt.tight_layout()

plt.show()

Sample output: [bar chart image not reproduced]
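
Since the question asks for only the combinations that occur many times, the same frozenset idea can be combined with Counter.most_common to keep just the most frequent bins. The following is a minimal sketch, not part of the original answer: the helper name counter_histogram and its top_n parameter are made up for illustration, and it assumes the counted elements can be sorted against each other (e.g. they are all strings).

```python
from collections import Counter


def counter_histogram(data, top_n=None):
    """Count how often each distinct Counter occurs in `data`.

    Hypothetical helper: returns (label, count) pairs, most frequent first.
    With top_n=None every distinct Counter is returned.
    """
    # A Counter is unhashable, but a frozenset of its (element, count)
    # pairs is hashable and can serve as a key in another Counter.
    keyed = Counter(frozenset(Counter(d).items()) for d in data)

    bins = []
    for key, n in keyed.most_common(top_n):
        # Sort the pairs so the label does not depend on set iteration order.
        label = ", ".join(f"{elem}:{cnt}" for elem, cnt in sorted(key))
        bins.append((label, n))
    return bins


data_list = [['A', 'B', 'B'], ['B', 'A', 'B'], ['A', 'A', 'B'], ['C']]
bins = counter_histogram(data_list, top_n=2)
labels = [label for label, _ in bins]
heights = [n for _, n in bins]
```

The labels and heights lists can then be passed to plt.bar(labels, heights) in place of the keys and values used above.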

Upvotes: 1
