Reputation: 125
I know how to plot a histogram when the individual data points are given, like (33, 45, 54, 33, 21, 29, 15, ...),
by simply using something like matplotlib.pyplot.hist(x, bins=10),
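but what if I only have grouped data, i.e. value ranges with a count for each range (for example, ranges of marks and the number of students in each)?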
I know that I can use bar plots to mimic a histogram by changing the xticks, but what if I want to do this using only the hist function of matplotlib.pyplot?
Is it possible to do this?
Upvotes: 0
Views: 7599
Reputation: 41327
You can build the hist() parameters manually and use the existing value counts as weights.
Say you have this df:
>>> import pandas as pd
>>> import matplotlib.pyplot as plt
>>> df = pd.DataFrame({'Marks': ['0-10', '10-20', '20-30', '30-40'], 'Number of students': [8, 12, 24, 26]})
>>> df
   Marks  Number of students
0   0-10                   8
1  10-20                  12
2  20-30                  24
3  30-40                  26
The bins are all the unique boundary values in Marks:
>>> bins = pd.unique(df.Marks.str.split('-', expand=True).astype(int).values.ravel())
>>> bins
array([ 0, 10, 20, 30, 40])
Choose one x value per bin, e.g. the left edge to make it easy (each left edge falls inside its own bin):
>>> x = bins[:-1]
>>> x
array([ 0, 10, 20, 30])
Use the existing value counts (Number of students) as weights:
>>> weights = df['Number of students'].values
>>> weights
array([ 8, 12, 24, 26])
Then plug these into hist():
>>> plt.hist(x=x, bins=bins, weights=weights)
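For reference, the steps above can be collected into one script. This is only a sketch: it skips the DataFrame parsing and hard-codes the same bin edges and counts as NumPy arrays, and the Agg backend, axis labels and edgecolor are illustrative choices that are not part of the answer. The point is that hist() hands back the bar heights, so you can check that they match the grouped counts.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for an interactive window
import matplotlib.pyplot as plt
import numpy as np

# Same grouped data as above, written directly as arrays
bins = np.array([0, 10, 20, 30, 40])   # bin edges
weights = np.array([8, 12, 24, 26])    # number of students per bin
x = bins[:-1]                          # one representative value per bin (left edge)

# Each x value lands in exactly one bin and contributes its weight to that bin
counts, edges, patches = plt.hist(x, bins=bins, weights=weights, edgecolor="black")

plt.xticks(bins)
plt.xlabel("Marks")
plt.ylabel("Number of students")
```

Here counts holds the height of each bar and edges the bin boundaries that were actually used, so comparing them with the inputs is a quick sanity check.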
Upvotes: 2
Reputation: 104
One possibility is to “ungroup” data yourself.
For example, for the 8 students with a mark between 0 and 10, you can generate 8 data points with the value 5 (the midpoint of the bin). For the 12 with a mark between 10 and 20, you can generate 12 data points with the value 15.
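A minimal sketch of this ungrouping, assuming the same bins and counts as in the other answer (the variable names are made up for illustration). It uses np.repeat to emit each bin midpoint as many times as its count, then np.histogram to confirm that binning the result gives the original counts back; the ungrouped array could equally be passed to plt.hist with the same bins.

```python
import numpy as np

bins = np.array([0, 10, 20, 30, 40])   # bin edges
counts = np.array([8, 12, 24, 26])     # number of students per bin

# Midpoint of every bin: 5, 15, 25, 35
midpoints = (bins[:-1] + bins[1:]) / 2

# Repeat each midpoint count times: 8 fives, 12 fifteens, ...
ungrouped = np.repeat(midpoints, counts)

# Binning the synthetic points with the same edges recovers the counts
recovered, _ = np.histogram(ungrouped, bins=bins)
```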
However, the “ungrouped” data is only an approximation of the real data, so it is probably better to just use a matplotlib.pyplot.bar plot.
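One way the bar version might look, again assuming the bins and counts from the other answer; the Agg backend and the labels are illustrative assumptions. Setting width to the bin widths and align="edge" makes every bar span its bin exactly, so the result looks like a histogram.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for an interactive window
import matplotlib.pyplot as plt
import numpy as np

bins = np.array([0, 10, 20, 30, 40])   # bin edges
counts = np.array([8, 12, 24, 26])     # number of students per bin

# Bars start at each left edge and are as wide as their bin
bars = plt.bar(bins[:-1], counts, width=np.diff(bins), align="edge", edgecolor="black")

plt.xticks(bins)
plt.xlabel("Marks")
plt.ylabel("Number of students")
```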
Upvotes: 0