Reputation: 567
I have a list of lists:
my_list= [['UV'],
['SB'],
['NMR'],
['ISSN'],
['UK', 'USA'],
['MT'],
['UK'],
['UK'],
['ESP'],
['UK'],
['UK'],
['UK'],
['UK'],
['UK'],
['UK']]
that I would like to plot by frequency (from the most frequent term to the least frequent).
I am having trouble counting the items. What I first did was flatten the list of lists:
flattened = []
for sublist in my_list:
    for val in sublist:
        flattened.append(val)
Then I tried to count the items and plot them:
from collections import Counter
import pandas as pd
counts = Counter(flattened)
df_ver = pd.DataFrame.from_dict(counts, orient='index')
df_ver.plot(kind='bar')
However, it does not work. Also, I guess the result is not sorted.
Upvotes: 1
Views: 862
Reputation: 25189
Let's try with pure Python:
import matplotlib.pyplot as plt

counts = {}
for countries in my_list:
    for country in countries:
        counts[country] = counts.get(country, 0) + 1
# sort by count (descending), then alphabetically to break ties
sorted_counts = sorted(counts.items(), key=lambda i: (-i[1], i[0]))
# ktop = 10
# sorted_counts = sorted_counts[:ktop]
countries, counts = zip(*sorted_counts)
plt.bar(countries, counts)
plt.show()
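To see what the sort key does on its own, here is a minimal sketch with made-up tallies (the numbers and the name `ranked` are only for illustration, they are not taken from the question):

```python
# Hypothetical tallies, chosen so that three labels tie on count.
counts = {"UK": 9, "UV": 1, "ESP": 1, "MT": 1}

# Negating the count sorts highest first; the label then breaks ties
# in ascending alphabetical order.
ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
print(ranked)  # [('UK', 9), ('ESP', 1), ('MT', 1), ('UV', 1)]
```

Without the second element of the key, tied labels would simply keep the order in which they were first counted.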
Upvotes: 1
Reputation: 12410
Since you already use Counter:
from matplotlib import pyplot as plt
from collections import Counter
from itertools import chain
my_list= [['UV'],
['SB'],
['NMR'],
['ISSN'],
['UK', 'USA'],
['MT'],
['UK'],
['UK'],
['ESP'],
['UK'],
['UK'],
['UK'],
['UK'],
['UK'],
['UK']]
counts = Counter(chain.from_iterable(my_list))
plt.bar(*zip(*counts.most_common()))
plt.show()
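If the `*zip(*...)` call looks opaque, this small sketch shows what ends up being passed to `plt.bar`, without any plotting. It uses a shortened sample list, and the names `pairs`, `labels` and `heights` are mine, not part of the answer above:

```python
from collections import Counter
from itertools import chain

# Shortened sample data (an assumption, not the full list from the question).
my_list = [['UV'], ['UK', 'USA'], ['UK'], ['ESP'], ['UK']]

counts = Counter(chain.from_iterable(my_list))

# most_common() returns (label, count) pairs, highest count first;
# labels with equal counts stay in the order they were first seen.
pairs = counts.most_common()

# zip(*pairs) transposes the pairs into one tuple of labels and one of counts.
labels, heights = zip(*pairs)
print(labels)   # ('UK', 'UV', 'USA', 'ESP')
print(heights)  # (3, 1, 1, 1)
```

So `plt.bar(*zip(*counts.most_common()))` should be equivalent to `plt.bar(labels, heights)`.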
Upvotes: 1
Reputation: 4761
Another option, starting from the df_ver you already built:
df_ver = df_ver.sort_values(0, ascending = False)
df_ver.plot(kind = "bar", legend = False)
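For completeness, a self-contained sketch of this route, again on a shortened sample list (my assumption) and stopping just before the plot. It relies on `from_dict(..., orient='index')` producing a single column labelled `0`, which is why `sort_values(0, ...)` works here:

```python
from collections import Counter
from itertools import chain

import pandas as pd

# Shortened sample data (an assumption, not the full list from the question).
my_list = [['UV'], ['UK', 'USA'], ['UK'], ['ESP'], ['UK']]

counts = Counter(chain.from_iterable(my_list))

# One row per label; the counts land in a single unnamed column labelled 0.
df_ver = pd.DataFrame.from_dict(counts, orient='index')

# Sort rows by that column, highest count first.
df_ver = df_ver.sort_values(0, ascending=False)
print(df_ver)
```

Calling `df_ver.plot(kind="bar", legend=False)` on the result then draws the bars in that order.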
Upvotes: 1