Reputation: 51
I have a column in a pandas DataFrame that holds three possible categorical values. When I plot it with matplotlib using plt.hist(data['column']), the histogram bars are not aligned with the x-axis ticks and are not evenly spaced. How can I fix this?
Upvotes: 5
Views: 23226
Reputation: 3670
Histograms are used to plot the frequency distribution of numerical variables (continuous or discrete). The frequency distribution of a categorical variable is better displayed with a bar chart. To draw one, first compute the frequency of each category with value_counts, then plot the result directly with pandas plot.bar, or with matplotlib if you prefer, as shown below.
import numpy as np               # v 1.19.2
import pandas as pd              # v 1.2.3
import matplotlib.pyplot as plt  # v 3.3.4

# Sample data: one categorical column with three possible values
data = pd.DataFrame(dict(column=np.repeat(['F', 'M', '--'], [11000, 13000, 3000])))

# Option 1: plot the category counts directly with pandas
data['column'].value_counts().plot.bar(rot=0)
plt.show()

# Option 2: compute the counts once, then plot them with matplotlib
counts = data['column'].value_counts()
plt.bar(counts.index, counts.values, width=0.5)
plt.show()
Upvotes: 9
Reputation: 25100
You have to tell Matplotlib how many bins you need (by default, hist uses 10 bins), and you have to set the tick positions yourself, taking into account that the x-axis runs from 0 to the number of categories minus one (from 0 to 2 in my example below, where there is one bin per category).
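from numpy import array, linspace
from numpy.random import randint
from matplotlib.pyplot import hist, xticks, show

# synthesize some data: 20000 labels drawn from 'A', 'B', 'C' with weights 1:3:2
x = array([{0:'A',1:'B',2:'B',3:'B',4:'C',5:'C'}[n] for n in randint(0, 6, 20000)])
nc = len(set(x))  # number of categories

# one bin per category; rwidth < 1 leaves a gap between the bars
hist(x, bins=nc, rwidth=0.7)
# put the ticks at the bin centers: every other point of an evenly spaced grid over [0, nc-1]
xticks(linspace(0, nc-1, 2*nc+1)[1::2])
show()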
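To see why the default settings misplace the bars, here is a minimal pure-Python sketch of equal-width binning. It assumes the three categories sit at x-positions 0, 1 and 2 (the range described above); the helper name bin_layout is made up for this illustration and is not a Matplotlib function.

```python
def bin_layout(positions, nbins):
    """Return (bin_index, left_edge, right_edge) for each position.

    Equal-width bins span [min, max] of the data; the last bin also
    includes the maximum, as histogram functions usually do.
    """
    lo, hi = min(positions), max(positions)
    layout = []
    for x in positions:
        # index of the bin containing x, clamped so that x == hi lands in the last bin
        i = min(int((x - lo) * nbins / (hi - lo)), nbins - 1)
        left = lo + i * (hi - lo) / nbins
        right = lo + (i + 1) * (hi - lo) / nbins
        layout.append((i, left, right))
    return layout

positions = [0, 1, 2]  # assumed x-positions of the three categories

# 10 bins: the bars land in bins 0, 5 and 9, off-center and unevenly spaced
for i, left, right in bin_layout(positions, 10):
    print(f"bin {i}: [{left:.2f}, {right:.2f}]")

# one bin per category: evenly spaced bars, centered at 1/3, 1 and 5/3
for i, left, right in bin_layout(positions, 3):
    print(f"bin {i}: center {(left + right) / 2:.2f}")
```

With 10 bins the occupied intervals are [0, 0.2], [1, 1.2] and [1.8, 2], which matches the symptom in the question. With one bin per category, the centers coincide with the tick positions computed by linspace(0, nc-1, 2*nc+1)[1::2].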
Upvotes: 3
Reputation: 94
The rwidth parameter specifies the width of each bar relative to the width of its bin. For example, if the bin width is 1 and rwidth=0.5, the bar width is 0.5, leaving a space of 0.25 on each side of the bar.
Note that this gives a space of 0.5 between consecutive bars. With the number of bins you have, you won't see these spaces, but with fewer bins they do show up.
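The arithmetic above can be checked with a small pure-Python sketch. It assumes each bar is centered in its bin, as the description implies; bar_span is a hypothetical helper written for this example, not part of Matplotlib.

```python
def bar_span(bin_left, bin_right, rwidth):
    """Return (left, right) of a bar centered in its bin with relative width rwidth."""
    bin_width = bin_right - bin_left
    margin = (1 - rwidth) * bin_width / 2  # empty space on each side of the bar
    return bin_left + margin, bin_right - margin

# two adjacent bins of width 1, with rwidth=0.5
first = bar_span(0, 1, 0.5)
second = bar_span(1, 2, 0.5)
gap = second[0] - first[1]  # space between the two consecutive bars

print(first, second, gap)
```

Each bar leaves a margin of 0.25 inside its bin, so the two margins that meet at a shared bin edge add up to the 0.5 gap between neighboring bars.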
Upvotes: 0