Reputation: 663
Suppose I have a function, something like:
fun <- function(x) 2.3*exp(-x/2)
and I want to get the mean value of this function over an interval, say from 2 to 20.
To get the mean, the first thing that comes to mind is this:
mean(fun(2:20))
That is, simply evaluate the function at a set of values and compute the mean.
However, I wonder whether there is a more precise way to obtain this. Any ideas?
Upvotes: 0
Views: 335
Reputation: 60964
Analytically, the mean value of a function f on the interval [a, b] is given by:
$$\bar{f} = \frac{1}{b - a} \int_a^b f(x)\,dx$$
So, after finding the antiderivative, you evaluate it at the two endpoints and get the mean value analytically. In your case the antiderivative is -4.6 * exp(-0.5 * x), which gives a mean value of 1/(20-2) * (-4.6 * exp(-0.5 * 20) + 4.6 * exp(-0.5 * 2)) = 0.09400203.
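As a quick check of the arithmetic above, here is a minimal sketch that evaluates the hand-derived antiderivative at both endpoints. The helper name `fun_antiderivative` and the variables `a`, `b`, and `analytic_mean` are made up for illustration.

```r
# Function from the question
fun <- function(x) 2.3 * exp(-x / 2)

# Antiderivative of fun, worked out by hand:
# the integral of 2.3 * exp(-x/2) is -4.6 * exp(-x/2).
# (fun_antiderivative is a hypothetical helper name.)
fun_antiderivative <- function(x) -4.6 * exp(-x / 2)

a <- 2
b <- 20

# Mean value = (F(b) - F(a)) / (b - a)
analytic_mean <- (fun_antiderivative(b) - fun_antiderivative(a)) / (b - a)
print(analytic_mean)
```

This only works when the antiderivative is available in closed form, as it is for this exponential.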
Now I will focus on sampling along the interval and calculating the mean from those samples:
get_sample_mean_from_function = function(func, interval, n = 1000) {
  # n evenly spaced points covering the interval, endpoints included
  interval_samples = seq(interval[1], interval[2], length.out = n)
  function_values = sapply(interval_samples, func)
  return(mean(function_values))
}
fun <- function(x) 2.3*exp(-x/2)
get_sample_mean_from_function(fun, interval = c(2,20))
By increasing n (the number of samples taken) you can increase the precision of your answer. This is how the mean value develops with increasing sample size:
n_list = c(1,4,10,15,25,50,100,500,1000,10e3,100e3,100e4,100e5)
mean_list = sapply(n_list,
function(x) get_sample_mean_from_function(fun,
interval = c(2,20), n = x))
library(ggplot2)
qplot(n_list, mean_list, geom = "point", log = "x")
Notice that the estimate only begins to settle at around 1000 samples. If we compare this numerical solution with the analytical value:
real_value = 1/(20-2) * (-4.6 * exp(-0.5 * 20) + 4.6 * exp(-0.5 * 2))
mean_list - real_value
[1] 7.521207e-01 1.286106e-01 3.984653e-02 2.494165e-02 1.421951e-02
[6] 6.841070e-03 3.355199e-03 6.607662e-04 3.297467e-04 3.291750e-05
[11] 3.291179e-06 3.291122e-07 3.291116e-08
We see that even for 100e5 samples, the difference between the analytical and numerical solutions (about 3e-08) is still large compared to double-precision floating-point accuracy. The differences listed above shrink roughly in proportion to 1/n.
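One possible explanation for the slow convergence is that a plain mean gives the two endpoints the same weight as the interior points. The sketch below swaps in the composite trapezoidal rule, which halves the endpoint weights. This is a different technique from the plain mean used above, and `get_trapezoid_mean` is a hypothetical helper name, not part of the code in this answer.

```r
# Function from the question
fun <- function(x) 2.3 * exp(-x / 2)

# Hypothetical helper: mean value via the composite trapezoidal rule.
# Assumes n >= 2 so that the interval has two distinct endpoints.
get_trapezoid_mean <- function(func, interval, n = 1000) {
  x <- seq(interval[1], interval[2], length.out = n)
  y <- sapply(x, func)
  h <- (interval[2] - interval[1]) / (n - 1)  # spacing between samples
  # Endpoints get half weight, interior points get full weight
  integral <- h * (sum(y) - (y[1] + y[n]) / 2)
  integral / (interval[2] - interval[1])
}

# Analytical mean value, for comparison
real_value <- (-4.6 * exp(-0.5 * 20) + 4.6 * exp(-0.5 * 2)) / (20 - 2)

# Errors of the plain mean and the trapezoidal mean at the same n
plain_error <- mean(fun(seq(2, 20, length.out = 1000))) - real_value
trapezoid_error <- get_trapezoid_mean(fun, c(2, 20), n = 1000) - real_value
print(c(plain = plain_error, trapezoid = trapezoid_error))
```

For a smooth function like this one, the trapezoidal estimate should land noticeably closer to the analytical value than the plain mean at the same number of samples.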
If you need very high precision, I'd go for an analytical solution. In practice, however, 5000 samples are more than enough to get reasonable accuracy.
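When no closed-form antiderivative is available, a middle ground is adaptive numerical quadrature. Here is a minimal sketch using base R's `integrate()`, assuming the function is vectorized over `x` (as `fun` is). The names `result` and `quadrature_mean` are illustrative.

```r
# Function from the question (vectorized, as integrate() requires)
fun <- function(x) 2.3 * exp(-x / 2)

a <- 2
b <- 20

# Adaptive quadrature over [a, b]; the result carries an error estimate
result <- integrate(fun, lower = a, upper = b)

# Divide the integral by the interval length to get the mean value
quadrature_mean <- result$value / (b - a)
print(quadrature_mean)

# Estimated absolute error of the mean, as reported by integrate()
print(result$abs.error / (b - a))
```

`integrate()` picks its own evaluation points, so there is no sample count to tune. The reported `abs.error` is only an estimate and is worth checking for less well-behaved functions.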
Upvotes: 2