Reputation: 111
I am trying to draw a curve on top of a histogram. However, the curve starts at y=0 at some negative x value, whereas it should begin at x=0, where the data have the highest frequency.
These are the values of data:
[1] 0.41645505 0.17807010 0.04401494 0.00000000 0.53424325 0.00000000 0.78833026 0.14429310 0.00000000 0.35345068 0.00000000 0.00000000
[13] 0.03157549 0.00000000 0.00000000 0.83979615 0.15510495 0.00000000 0.00000000 0.38146542 0.60273251 0.28437203 0.00000000 0.00000000
[25] 0.63672858 0.00000000 0.28479730 0.00000000 0.73017781 0.39795789 0.00000000 0.00000000 0.56448031 0.00000000 0.92790850 0.00000000
[37] 0.00000000 0.46136357 0.27828194 0.00000000 0.01385383 0.36895497 0.06200592 0.00000000 0.17517336 0.57521911 0.00000000 0.32508820
[49] 0.00000000 0.00000000
hist(data)
The histogram that is produced is fine. However, when I tried to plot a curve on top:
plot(density(data))
it produced a plot which started from (-0.2, 0), but there is no value in data which is negative.
I want a curve/line on the top of the bars in the histogram.
Upvotes: 0
Views: 172
Reputation: 226182
tl;dr use from=0 in your density() call to restrict the range. (Don't forget to use freq=FALSE or prob=TRUE in your hist() call to scale the histogram to densities rather than counts.)
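To see why the rescaling matters, here is a minimal sketch. The vector x is a small made-up sample standing in for the question's data (an assumption, not the real values). It shows that bar heights on the density scale integrate to 1, which is the scale density() uses, while the counts sum to the sample size.

```r
# Small made-up sample standing in for the question's data (an assumption)
x <- c(0, 0, 0, 0, 0.04, 0.18, 0.35, 0.42, 0.53, 0.79)

# Compute the histogram without drawing it
h <- hist(x, plot = FALSE)

# On the density scale, bar height times bar width sums to 1
area_density <- sum(h$density * diff(h$breaks))

# On the count scale, the bar heights sum to the sample size
total_counts <- sum(h$counts)

print(c(area_density = area_density, total_counts = total_counts))
```

If you would rather keep counts on the y-axis, one option for equal-width bins is to multiply the density curve's y values by length(x) times the bin width before drawing it.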
Data:
dat <- c(0.41645505,0.17807010,0.04401494,0.00000000, 0.53424325,
0.00000000,0.78833026,0.14429310,0.00000000,0.35345068,
0.00000000,0.00000000,0.03157549,0.00000000,0.00000000,
0.83979615,0.15510495,0.00000000,0.00000000,0.38146542,
0.60273251,0.28437203,0.00000000,0.00000000,0.63672858,
0.00000000,0.28479730,0.00000000,0.73017781,0.39795789,
0.00000000,0.00000000,0.56448031,0.00000000,0.92790850,
0.00000000,0.00000000,0.46136357,0.27828194,0.00000000,
0.01385383,0.36895497,0.06200592,0.00000000,0.17517336,
0.57521911,0.00000000,0.32508820,0.00000000,0.00000000)
Using from=0 in density() tells R to start the output from 0. If you want a wigglier, less-smooth line, you can lower the adjust argument to density(). @RuiBarradas's answer shows you how to put a smooth line through the midpoints of the tops of the histogram bars, although arguably this doesn't make much theoretical sense as a way to characterize the density.
par(las=1)
hist(dat,freq=FALSE,col="gray", main="")
lines(density(dat, from=0),col=2,lwd=2)
lines(density(dat, from=0, adjust=0.25),col=4,lwd=2)
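As for why the default curve starts below zero: when from is not supplied, density() starts its grid at min(x) - cut * bw, with cut defaulting to 3, so the smoothed curve extends past the smallest data value. The sketch below checks this on a small made-up vector x (an assumption standing in for dat). It also shows that from=0 only truncates the evaluated grid and does not renormalize, so the area under the truncated curve falls below 1.

```r
# Small made-up sample standing in for dat (an assumption)
x <- c(0, 0, 0, 0, 0.04, 0.18, 0.35, 0.42, 0.53, 0.79)

d_default <- density(x)            # grid starts below the smallest value
d_zero    <- density(x, from = 0)  # grid starts exactly at 0

# Default left end of the grid is min(x) - cut * bw, with cut = 3
print(min(d_default$x))
print(min(x) - 3 * d_default$bw)
print(min(d_zero$x))

# Trapezoid-rule area under an estimated density curve
curve_area <- function(d) {
  sum(diff(d$x) * (head(d$y, -1) + tail(d$y, -1)) / 2)
}

area_default <- curve_area(d_default)
area_zero    <- curve_area(d_zero)
print(c(default = area_default, from_zero = area_zero))
```

For a purely visual overlay this truncation is usually acceptable. If the curve must integrate to 1 on the non-negative range, a boundary-corrected estimator would be needed instead.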
Upvotes: 4
Reputation: 1378
Using lattice you can find and visualize the distribution within each bin:
dat <- c(0.41645505,0.17807010,0.04401494,0.00000000, 0.53424325,
0.00000000,0.78833026,0.14429310,0.00000000,0.35345068,
0.00000000,0.00000000,0.03157549,0.00000000,0.00000000,
0.83979615,0.15510495,0.00000000,0.00000000,0.38146542,
0.60273251,0.28437203,0.00000000,0.00000000,0.63672858,
0.00000000,0.28479730,0.00000000,0.73017781,0.39795789,
0.00000000,0.00000000,0.56448031,0.00000000,0.92790850,
0.00000000,0.00000000,0.46136357,0.27828194,0.00000000,
0.01385383,0.36895497,0.06200592,0.00000000,0.17517336,
0.57521911,0.00000000,0.32508820,0.00000000,0.00000000)
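# hist() draws the histogram and returns its bin breaks
dat.hist <- hist(dat, breaks = 6, border = "white", col = "gray", main = "")
plot(dat.hist)
library(lattice)
# include.lowest = TRUE keeps the zeros in the first bin, as hist() does;
# without it cut() turns them into NA and they are dropped from the panels
densityplot( ~ dat | cut(dat, breaks = dat.hist$breaks, include.lowest = TRUE),
layout = c(5, 1))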
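One detail worth checking when reusing histogram breaks with cut(): by default its intervals are right-closed and exclude the lowest break, so values equal to the first break (here 0) become NA unless include.lowest = TRUE is set. A minimal sketch, using a made-up vector x and made-up breaks b (both assumptions for illustration):

```r
# Made-up values and breaks for illustration (assumptions)
x <- c(0, 0, 0, 0.15, 0.45, 0.95)
b <- seq(0, 1, by = 0.2)

# Default: intervals are (a, b], so the zeros fall outside every bin
bins_default <- cut(x, breaks = b)

# include.lowest = TRUE closes the first interval on the left
bins_closed <- cut(x, breaks = b, include.lowest = TRUE)

print(table(bins_default, useNA = "ifany"))
print(table(bins_closed, useNA = "ifany"))
```

Since half of the values in dat are exactly 0, this choice decides whether the first panel reflects them at all.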
Upvotes: 0