Reputation: 131
Suppose I have
set.seed(2020) # make the results reproducible
a <- rnorm(100, 0, 1)
My probability density is estimated through kernel density estimator (gaussian) in R using the R built in function density
. The question is how to integrate the square of the estimated density. It does not matter between which values, let us suppose between -Inf
and +Inf
. I have tried the following:
f <- approxfun(density(a)$x, density(a)$y)
integrate (f*f, min(density(a)$x), max(density(a)$x))
Upvotes: 3
Views: 383
Reputation: 173793
There are a couple of problems here. First you have the x
and y
round the wrong way in approxfun
. Secondly, you can't multiply function names together. You need to specify a new function that gives you the square of your original function:
set.seed(2020)
a <- rnorm(100, 0, 1)
f <- approxfun(density(a)$x, density(a)$y)
f2 <- function(v) ifelse(is.na(f(v)), 0, f(v)^2)
integrate (f2, -Inf, Inf)
#> 0.2591153 with absolute error < 0.00011
We can also plot the original density function and the squared density function:
curve(f, -3, 3)
curve(f2, -3, 3, add = TRUE, col = "red")
Upvotes: 3
Reputation: 76402
Here is a way using package caTools
, function trapz
. It computes the integral given a vector x
and its corresponding image y
using the trapezoidal rule.
I also include a function trapzf
based on the original to have the integral computed with the function returned by approxfun
trapzf <- function(x, FUN) trapz(x, FUN(x))
set.seed(2020) # make the results reproducible
a <- rnorm(100, 0, 1)
d <- density(a)
f <- approxfun(d$x, d$y)
int1 <- trapz(d$x, d$y^2)
int2 <- trapzf(d$x, function(x) f(x)^2)
int1
#[1] 0.2591226
identical(int1, int2)
#[1] TRUE
Upvotes: 2
Reputation: 101064
I think you should write the objective function as function(x) f(x)**2
, rather than f*f
, e.g.,
> integrate (function(x) f(x)**2, min(density(a)$x), max(density(a)$x))
0.2331793 with absolute error < 6.6e-06
Upvotes: 2