Reputation: 23
I would like to program a kernel estimate (with Epanechnikov kernel^1 for example). I tried the following code^2 by putting the manual code (blue) and the default code (red) on the same figure (see attached) but it always gives a difference between the two density curves!
1: The analytic form of the Epanechnikov kernel is: kappa(u) = (1-u^2), support |u| <=1, with u = (x-x_{i})/h.
2: My trial code:
x= faithful$eruptions
fit2 <- density(x, bw = 0.6, kernel = "epanechnikov")
xgrid = seq(-1, 8, 0.1)
kernelEpan <- function(x, obs, h) sum((1-((x-obs)/h)^2)*(abs(x-obs)<=h))/h
plot(xgrid, sapply(xgrid, FUN = kernelEpan, obs = faithful$eruptions, h = 0.6)/length(faithful$eruptions), type = "l", col = "blue")
lines(fit2, col = "red")
Upvotes: 2
Views: 1341
Reputation: 173793
If you read the docs for bw
in the density
function, you will see:
bw : the smoothing bandwidth to be used. The kernels are scaled such that this is the standard deviation of the smoothing kernel.
Which means that in order for your function's h
parameter to match the behaviour of the bw
parameter, you will need to rescale the h
parameter by multiplying it by sqrt(5)
.
I would be tempted to vectorize your function, which allows you to normalize it accurately too:
kernelEpan <- function(xvals, obs, h) {
h <- h * sqrt(5)
dens <- sapply(xvals, function(x) {
u <- abs(x - obs) / h
u <- ifelse(u > 1, 1, u)
sum(1 - u^2)
})
dens / sum(dens * mean(diff(xvals)))
}
This allows:
fit1 <- kernelEpan(xgrid, obs = faithful$eruptions, h = 0.6)
fit2 <- density(x, bw = 0.6, kernel = "epanechnikov")
plot(xgrid, fit1, type = "l", col = "blue")
lines(fit2, col = "red")
Upvotes: 3