ArMonk

Reputation: 366

matplotlib: a histogram with strings on the x-axis

I want to plot a histogram of a very basic pandas Series. In the example below, I simply want the x-axis to show "ice-cream", "chocolate", and "coffee", and the y-axis to show their counts (2, 3, and 1). Is this possible? Note that the index (the first column) is not sequential because I have filtered out NaN values.

print(rbr_null_false)
45    ice-cream
101   chocolate
102   ice-cream
103   coffee
112   chocolate
120   chocolate

fig, ax = plt.subplots()
ax.hist(rbr_null_false)
plt.show()

This resulted in the following error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-28-7d1a5e1bb62b> in <module>()
     28 
     29 fig, ax = plt.subplots()
---> 30 ax.hist(rbr_null_false)
     31 #plt.xlabel('index', fontsize=12);
     32 #plt.ylabel('prod_rollback_date', fontsize=12);

~/anaconda3/lib/python3.5/site-packages/matplotlib/__init__.py in inner(ax, *args, **kwargs)
   1810                     warnings.warn(msg % (label_namer, func.__name__),
   1811                                   RuntimeWarning, stacklevel=2)
-> 1812             return func(ax, *args, **kwargs)
   1813         pre_doc = inner.__doc__
   1814         if pre_doc is None:

~/anaconda3/lib/python3.5/site-packages/matplotlib/axes/_axes.py in hist(self, x, bins, range, normed, weights, cumulative, bottom, histtype, align, orientation, rwidth, log, color, label, stacked, **kwargs)
   5993             xmax = -np.inf
   5994             for xi in x:
-> 5995                 if len(xi) > 0:
   5996                     xmin = min(xmin, xi.min())
   5997                     xmax = max(xmax, xi.max())

TypeError: len() of unsized object

Upvotes: 1

Views: 5631

Answers (1)

Y. Luo

Reputation: 5732

Although you asked for a histogram, what you want is actually a bar plot. "A histogram is an accurate graphical representation of the distribution of numerical data." Your example is categorical data, so count the occurrences of each value and plot the counts as bars:

import io

import matplotlib.pyplot as plt
import pandas as pd

data = """45    ice-cream
101 chocolate
102 ice-cream
103 coffee
112 chocolate
120 chocolate"""
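# The sample is whitespace-separated, so split on runs of whitespace
# rather than on the default tab delimiter.
df = pd.read_table(io.StringIO(data), sep=r'\s+', header=None)
s = df[1]  # column 1 holds the category labels

# Count each category and draw one bar per category.
s.value_counts().plot(kind='bar')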
plt.show()
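
If you would rather stay with the `fig, ax = plt.subplots()` style from the question, here is a minimal sketch of the same idea drawn with `ax.bar`. It makes a few assumptions: the Series is rebuilt by hand from the printed sample, your matplotlib is recent enough to accept string labels on the x-axis (2.1 or later, as far as I know), and reindexing by `s.unique()` is only my way of getting the bars in the order the question lists them.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Sample data copied from the question; the index has gaps because
# the NaN rows were filtered out beforehand.
s = pd.Series(
    ["ice-cream", "chocolate", "ice-cream", "coffee", "chocolate", "chocolate"],
    index=[45, 101, 102, 103, 112, 120],
)

# Count each label, then reorder the counts by first appearance in the data
# (value_counts() alone sorts by descending count).
counts = s.value_counts().reindex(s.unique())

fig, ax = plt.subplots()
# One bar per category: string labels on the x-axis, counts as bar heights.
ax.bar(list(counts.index), counts.values)
ax.set_ylabel("count")
```

Call `plt.show()` (or `fig.savefig(...)`) afterwards to see the figure. If you don't care about the bar order, drop the `.reindex(...)` step.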


Upvotes: 4
