Reputation: 366
I want to plot a histogram of a very basic pandas Series. In the example below, I simply want the x-axis to show "ice-cream", "chocolate", and "coffee", and the y-axis to show their counts (2, 3, and 1). Is this possible? Note that the index in the first column is not sequential because I have filtered out NaN values.
print(rbr_null_false)
45 ice-cream
101 chocolate
102 ice-cream
103 coffee
112 chocolate
120 chocolate
fig, ax = plt.subplots()
ax.hist(rbr_null_false)
plt.show()
This resulted in the following error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-28-7d1a5e1bb62b> in <module>()
28
29 fig, ax = plt.subplots()
---> 30 ax.hist(rbr_null_false)
31 #plt.xlabel('index', fontsize=12);
32 #plt.ylabel('prod_rollback_date', fontsize=12);
~/anaconda3/lib/python3.5/site-packages/matplotlib/__init__.py in inner(ax, *args, **kwargs)
1810 warnings.warn(msg % (label_namer, func.__name__),
1811 RuntimeWarning, stacklevel=2)
-> 1812 return func(ax, *args, **kwargs)
1813 pre_doc = inner.__doc__
1814 if pre_doc is None:
~/anaconda3/lib/python3.5/site-packages/matplotlib/axes/_axes.py in hist(self, x, bins, range, normed, weights, cumulative, bottom, histtype, align, orientation, rwidth, log, color, label, stacked, **kwargs)
5993 xmax = -np.inf
5994 for xi in x:
-> 5995 if len(xi) > 0:
5996 xmin = min(xmin, xi.min())
5997 xmax = max(xmax, xi.max())
TypeError: len() of unsized object
Upvotes: 1
Views: 5631
Reputation: 5732
Although you asked for a histogram, what you want is actually a bar plot. "A histogram is an accurate graphical representation of the distribution of numerical data." Your example is categorical data, so count the occurrences of each category and plot those counts as bars:
import io
import matplotlib.pyplot as plt
import pandas as pd
data = """45 ice-cream
101 chocolate
102 ice-cream
103 coffee
112 chocolate
120 chocolate"""
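# The sample is whitespace-separated, so pass sep explicitly (read_table defaults to tabs)
df = pd.read_table(io.StringIO(data), sep=r"\s+", header=None)
s = df[1]  # the second column holds the category labels
# Count each category, then draw one bar per category
s.value_counts().plot(kind='bar')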
plt.show()
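If you would rather keep the `fig, ax = plt.subplots()` pattern from the question, the same counts can be drawn on an explicit Axes. The following is only a sketch: the series `s` is a hand-built stand-in for the filtered data, and the output filename is arbitrary.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the filtered series shown in the question
s = pd.Series(
    ["ice-cream", "chocolate", "ice-cream", "coffee", "chocolate", "chocolate"],
    index=[45, 101, 102, 103, 112, 120],
)

# One entry per category, sorted by count in descending order
counts = s.value_counts()

fig, ax = plt.subplots()
positions = range(len(counts))
# Draw one bar per category at integer positions, then label the ticks
ax.bar(positions, counts.values, align="center")
ax.set_xticks(list(positions))
ax.set_xticklabels(counts.index)
ax.set_ylabel("count")

# Arbitrary filename; in an interactive session plt.show() can be used instead
fig.savefig("category_counts.png")
```

Placing the bars at integer positions and setting the tick labels separately avoids relying on Matplotlib's handling of string x-values, which older releases do not support.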
Upvotes: 4