Reputation: 283
I need to create a histogram from a dataframe column that contains the values 'Low', 'Medium', or 'High'. When I try the usual df.column.hist(), I get the following error.
ex3.Severity.value_counts()
Out[85]:
Low 230
Medium 21
High 16
dtype: int64
ex3.Severity.hist()
TypeError Traceback (most recent call last)
<ipython-input-86-7c7023aec2e2> in <module>()
----> 1 ex3.Severity.hist()
C:\Users\C06025A\Anaconda\lib\site-packages\pandas\tools\plotting.py in hist_series(self, by, ax, grid, xlabelsize, xrot, ylabelsize, yrot, figsize, bins, **kwds)
2570 values = self.dropna().values
2571
->2572 ax.hist(values, bins=bins, **kwds)
2573 ax.grid(grid)
2574 axes = np.array([ax])
C:\Users\C06025A\Anaconda\lib\site-packages\matplotlib\axes\_axes.py in hist(self, x, bins, range, normed, weights, cumulative, bottom, histtype, align, orientation, rwidth, log, color, label, stacked, **kwargs)
5620 for xi in x:
5621 if len(xi) > 0:
->5622 xmin = min(xmin, xi.min())
5623 xmax = max(xmax, xi.max())
5624 bin_range = (xmin, xmax)
TypeError: unorderable types: str() < float()
Upvotes: 27
Views: 60003
Reputation: 490
Just an updated answer, as this comes up a lot. Pandas has a nice module for styling dataframes in many ways, including the case mentioned above:
ex3.Severity.value_counts().to_frame().style.bar()
...will print the dataframe with bars built in (like sparklines, in Excel terminology). Nice for quick analysis in Jupyter notebooks.
Upvotes: 11
Reputation: 776
ex3.Severity.value_counts().plot(kind='bar')
Is what you actually want.
When you do:
ex3.Severity.value_counts().hist()
it gets the axes the wrong way round: it partitions your y axis (the counts) into bins, and then plots the number of string labels that fall in each bin.
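To make the difference concrete, here is a minimal sketch that rebuilds the counts from the question and shows what binning them produces. The `severity` series is a hypothetical stand-in for `ex3.Severity`, and `np.histogram` with 10 bins is used here only to mimic the default binning of `hist()`; it is not what pandas calls internally.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for ex3.Severity, built from the counts in the question
severity = pd.Series(["Low"] * 230 + ["Medium"] * 21 + ["High"] * 16, name="Severity")

# One count per category: this is what a bar chart should display
counts = severity.value_counts()
print(counts)

# Binning the counts themselves (roughly what value_counts().hist() plots):
# the three count values 230, 21 and 16 are spread over 10 equal-width bins
bin_totals, bin_edges = np.histogram(counts.values, bins=10)
print(bin_totals)
```

The binned result only says that two categories have small counts and one has a large count, which is why `counts.plot(kind='bar')` is the call that gives one bar per severity level.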
Upvotes: 63
Reputation: 394041
You assumed that, because your data is composed of strings, calling hist() on it would automatically perform the value_counts(), but this is not the case, hence the error. All you needed to do was:
ex3.Severity.value_counts().plot(kind='bar')
Upvotes: 5
Reputation: 9798
This is a matplotlib issue: it cannot order the string values, so hist() fails on them. However, you can achieve the desired result by labeling the x-ticks:
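import matplotlib.pyplot as plt
import pandas as pd

# emulate your ex3.Severity.value_counts()
data = {'Low': 2, 'Medium': 4, 'High': 5}
counts = pd.Series(data)
# draw one bar per category, then label each bar position with its category name
plt.bar(range(len(counts)), counts.values, align='center')
plt.xticks(range(len(counts)), counts.index.values, size='small')
plt.show()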
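If the bars should always appear in a logical Low, Medium, High order, one possible approach is to store the column as an ordered categorical before counting. This is only a sketch: the `raw` list and the `order` list are assumptions made up to match the emulated counts above, not data from the question.

```python
import pandas as pd

# Assumed logical ordering of the severity levels
order = ["Low", "Medium", "High"]

# Hypothetical raw values matching the emulated counts (2, 4 and 5)
raw = ["Low"] * 2 + ["Medium"] * 4 + ["High"] * 5
severity = pd.Series(pd.Categorical(raw, categories=order, ordered=True))

# sort_index() on a categorical index follows the category order,
# not alphabetical order and not count order
counts = severity.value_counts().sort_index()
print(counts)
```

The resulting `counts` can be passed to `plt.bar` as above, or plotted directly with `counts.plot(kind='bar')`.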
Upvotes: 8