J Zhao

Reputation: 5

Beginner in R: trying the hist function with a vector

I tried hist(y) with y = c(1, 2, 3, 4, 5), but the frequency shown for 1 is 2.0. Why?

Upvotes: 1

Views: 77

Answers (3)

NelsonGon

Reputation: 13309

This is not a mistake. The first bar shows the frequency of values in the range 1 to 2, and that range contains both 1 and 2. By default, hist sets right to TRUE, which means the intervals are right-closed, and include.lowest is also TRUE, so the lowest value, 1, is counted in the first interval as well. Change right to FALSE and the intervals become left-closed, which moves the frequency of 2 to the last bar (4 to 5) instead. For more details, please see help(hist).

Excerpt from the docs:

The definition of histogram differs by source (with country-specific biases). R's default with equi-spaced breaks (also the default) is to plot the counts in the cells defined by breaks. Thus the height of a rectangle is proportional to the number of points falling into the cell, as is the area provided the breaks are equally-spaced.

The default with non-equi-spaced breaks is to give a plot of area one, in which the area of the rectangles is the fraction of the data points falling in the cells.
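To make the second paragraph of that excerpt concrete, here is a small sketch with unequal breaks. The break points are made up purely for illustration; they are not from the question.

```r
y <- c(1, 2, 3, 4, 5)

# Hypothetical, deliberately unequal breaks (cell widths 2, 1 and 2).
h_unequal <- hist(y, breaks = c(0.5, 2.5, 3.5, 5.5), plot = FALSE)

# hist records whether the breaks are equally spaced.
h_unequal$equidist

# With unequal breaks the plotted bar heights are densities, not counts.
h_unequal$density

# Total area = sum over cells of (height * width); this should come out as 1.
total_area <- sum(h_unequal$density * diff(h_unequal$breaks))
total_area
```

With plot = FALSE, hist only returns the computed object, so you can inspect the numbers without drawing anything.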

Scenario 1:

hist(1:5)


Scenario 2:

hist(1:5, right = FALSE)

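If you would rather see the numbers behind the two scenarios than read them off the plots, something like the following should work. The variable names h_right and h_left are just mine.

```r
y <- 1:5

# Default: right = TRUE, intervals are (a, b], and include.lowest = TRUE
# puts the minimum value into the first interval.
h_right <- hist(y, plot = FALSE)

# right = FALSE: intervals are [a, b), with the maximum value included
# in the last interval.
h_left <- hist(y, right = FALSE, plot = FALSE)

h_right$breaks  # 1 2 3 4 5 (four cells)
h_right$counts  # 2 1 1 1
h_left$counts   # 1 1 1 2
```

Both calls use the same four cells; only the side on which a boundary value is counted changes.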

Upvotes: 4

Iman

Reputation: 2324

You may want to read up on how bins work in a histogram. A histogram is different from a frequency barplot: it counts values per interval, not per distinct value.
You can try:

barplot(table(y))

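As a rough sketch of why this gives one bar per value: table(y) counts each distinct value separately, and barplot simply draws those counts. The axis labels below are my own choice.

```r
y <- c(1, 2, 3, 4, 5)

# One entry per distinct value of y; no binning is involved.
freq <- table(y)
freq

# Draw one bar per distinct value.
barplot(freq, xlab = "y", ylab = "Frequency")
```

This suits discrete data like yours; for continuous data a histogram is usually the better fit.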

Upvotes: 2

G5W

Reputation: 37641

Notice that you only have four boxes. The first box counts both the 1 and the 2. You can get something closer to what you were expecting by specifying the breakpoints yourself.

hist(y, breaks=seq(0.5,5.5,1))

Histogram with 5 boxes
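To check that these breakpoints really put each value in its own box, you can inspect the object hist returns without plotting it. A minimal sketch:

```r
y <- c(1, 2, 3, 4, 5)

# Breakpoints halfway between the integers, so each value is centred in a cell.
breaks <- seq(0.5, 5.5, by = 1)
h <- hist(y, breaks = breaks, plot = FALSE)

h$mids    # cell midpoints: 1 2 3 4 5
h$counts  # one observation per cell: 1 1 1 1 1
```

Because no data value sits on a breakpoint, the right = TRUE / FALSE setting no longer affects the counts.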

Upvotes: 2
