Reputation: 23919
I've tried different functions and several different arguments but the histogram
hist(estimator, probability=T, br=5)
isn't showing densities on the y-axis no matter what plotting function or argument I choose. The vector "estimator" contains 100 values between roughly 0.4 and 0.6.
To be precise: creating a hist object, recalculating and changing the densities, and then plotting it again with plot() does work, but I don't want the hist object to be plotted in the first place :/
Upvotes: 2
Views: 7434
Reputation: 49670
When you specify probability=T (or better yet probability=TRUE, so that you don't get tripped up if T is reassigned to something other than TRUE), the bar heights are scaled so that the total area of the histogram bars is 1. Since the width of your bars is quite a bit less than 1, the heights need to be greater than 1 for the areas to add up to 1. This makes it easy to superimpose a density estimate curve or a theoretical density curve, or to add other references.
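To check the area argument numerically, here is a minimal sketch. It assumes stand-in data drawn with runif(), since the asker's actual "estimator" vector is not shown, and it uses plot = FALSE so that nothing is drawn while the histogram object is built.

```r
set.seed(1)                                    # arbitrary seed, only for reproducibility
estimator <- runif(100, min = 0.4, max = 0.6)  # stand-in data, NOT the asker's vector

# plot = FALSE returns the "histogram" object without drawing it
h <- hist(estimator, breaks = 5, plot = FALSE)

bar_widths <- diff(h$breaks)                   # width of each bar
total_area <- sum(h$density * bar_widths)      # sum of height * width over all bars
max_height <- max(h$density)                   # tallest bar on the density scale

print(total_area)
print(max_height)
```

With bars this narrow the tallest density value has to exceed 1 for the areas to sum to 1. The stored object can be drawn later on the density scale with plot(h, freq = FALSE), and a kernel estimate can be added with lines(density(estimator)).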
In general you should just ignore the tick labels on the y-axis (it would be better if they were not even plotted); they just distract from the important parts of the plot.
Many people think they want the y-axis tick labels to represent the proportion (or percentage) of observations within each grouping (and that is possible with your own custom axis), but I think this is still a distraction. Consider what happens if you change the number of bars/intervals in the histogram: the overall structure of the histogram stays the same (provided you don't make too drastic a change), but the tick labels on the y-axis change, sometimes by quite a bit, so they are better ignored (or not produced in the first place).
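The effect of the number of intervals on the y-axis scale can be sketched as follows. The data, the seed, and the two bin widths (1 and 0.25) are assumptions chosen only for illustration; the same sample is binned twice and the largest per-bin proportion is compared.

```r
set.seed(42)                           # arbitrary seed, only for reproducibility
x <- rgamma(327, shape = 5, rate = 3)  # illustrative sample

upper <- ceiling(max(x))               # integer upper limit so both grids end on it

# Same data, two bin widths; plot = FALSE so nothing is drawn
coarse <- hist(x, breaks = seq(0, upper, by = 1),    plot = FALSE)
fine   <- hist(x, breaks = seq(0, upper, by = 0.25), plot = FALSE)

# Proportion of observations in each bin
prop_coarse <- coarse$counts / sum(coarse$counts)
prop_fine   <- fine$counts / sum(fine$counts)

print(c(coarse = max(prop_coarse), fine = max(prop_fine)))        # proportion scale
print(c(coarse = max(coarse$density), fine = max(fine$density)))  # density scale
```

The largest proportion shrinks when the bins get narrower, because each bin simply holds fewer observations, even though the shape of the histogram is essentially unchanged.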
If you really think that the percentages (or proportions) are needed then the code is as simple as:
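# Example data: 327 draws from a gamma distribution
x <- rgamma(327, 5, 3)
# Draw the histogram of counts without the default y-axis
tmp <- hist(x, yaxt='n', ylab='Percent')
# Choose round percentage values for the tick labels
tmp2 <- pretty(tmp$counts/sum(tmp$counts)*100)
# Place each label at the count that corresponds to that percentage
axis(2, at=tmp2*sum(tmp$counts)/100, labels=tmp2)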
That could be easily wrapped into a function if you wanted.
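One possible way to wrap it is sketched below. The names percent_ticks and hist_percent are hypothetical helpers made up for this example, not part of base R, and the split into two functions is just one design choice that lets the tick calculation be checked without drawing anything.

```r
# Hypothetical helper: compute percentage tick labels and their
# positions on the count scale for an existing "histogram" object.
percent_ticks <- function(h) {
  n <- sum(h$counts)
  labels <- pretty(h$counts / n * 100)   # round percentage values
  list(at = labels * n / 100,            # matching positions in counts
       labels = labels)
}

# Hypothetical wrapper: draw a count histogram with a percent y-axis.
# Extra arguments in ... are passed on to hist().
hist_percent <- function(x, ..., ylab = "Percent") {
  h <- hist(x, yaxt = "n", ylab = ylab, ...)  # suppress the default y-axis
  ticks <- percent_ticks(h)
  axis(2, at = ticks$at, labels = ticks$labels)
  invisible(h)                                # return the object without printing it
}
```

A call such as hist_percent(rgamma(327, 5, 3)) should then give the same plot as the code above. Passing plot = FALSE through ... would not make sense here, since axis() needs an open plot to draw on.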
Upvotes: 8