Reputation: 209
I used Counter from collections to generate a large list of Counters, e.g. List = [Counter({'A': 4}), Counter({'A': 2}), Counter({'A': 4, 'B': 3}), ...].
I would like to build a histogram from that list, where each bin corresponds to one specific counter, i.e. all counters that have the same count for every element.
Here is an example:
from collections import Counter
data = [['A', 'B', 'B'], ['A', 'B', 'B'], ['A', 'B', 'B'], ['C']]
data_counters = []
for d in data:
    data_counters.append(Counter(d))
I could generate a list of all possible counter outcomes and then count how many times each one occurs in data_counters, but this is impractical because the number of possible outcomes is large.
Strictly speaking, this problem is just the calculation of a multi-dimensional histogram, where each dimension corresponds to a letter. But I want to avoid building that full histogram and only use bins for the combinations of letters that actually occur many times.
Upvotes: 1
Views: 197
Reputation: 12410
Counter objects are not hashable, so they cannot themselves be counted with another Counter. One way to deal with this is to convert each Counter into a frozenset of its items, which is hashable.
The disadvantage is that we have to translate the frozensets back into something like strings that matplotlib can use as labels for the bar chart, i.e. the histogram:
from collections import Counter
from matplotlib import pyplot as plt
data_list = [['A', 'B', 'B'], ['B', 'A', 'B'], ['A', 'A', 'B'], ['C']]
data_counter = Counter(frozenset(Counter(d).items()) for d in data_list)
plt.xticks(rotation=45, ha="right")
plt.bar([str(list(x)) for x in data_counter.keys()], data_counter.values())
plt.tight_layout()
plt.show()
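As a variation on the same idea, here is a minimal sketch that uses a sorted tuple of items as the hashable key instead of a frozenset, which gives a stable label order, and keeps only the most frequent bins via Counter.most_common. The helper names counter_key and label and the cutoff of 2 are assumptions made for illustration, not part of the original answer:

```python
from collections import Counter

data_list = [['A', 'B', 'B'], ['B', 'A', 'B'], ['A', 'A', 'B'], ['C']]

def counter_key(items):
    """Hypothetical helper: hashable, order-independent key for the letter counts."""
    return tuple(sorted(Counter(items).items()))

def label(key):
    """Hypothetical helper: turn a key such as (('A', 1), ('B', 2)) into 'A:1, B:2'."""
    return ", ".join(f"{letter}:{count}" for letter, count in key)

# Count how often each distinct combination of letter counts occurs.
histogram = Counter(counter_key(d) for d in data_list)

# Keep only the most frequent bins (the cutoff 2 is an arbitrary assumption).
top_bins = histogram.most_common(2)

labels = [label(key) for key, _ in top_bins]
heights = [count for _, count in top_bins]
print(labels, heights)
```

The resulting labels and heights can be passed to plt.bar(labels, heights) in place of the frozenset-based arguments used above.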
Upvotes: 1