How can I know the mean value of a function in an interval?

Question

Suppose I have a function, something like:

fun <- function(x) 2.3*exp(-x/2)

and I want to get the mean value of this function over an interval, say from 2 to 20.

To get the mean, the first thing that comes to mind is this:

mean(fun(2:20))

That is, simply evaluate the function at a few values and compute the mean of the results.

However, I wonder whether there is a more precise way to obtain this. Any ideas?

Paul Hiemstra · Accepted Answer

Analytically, you can determine the mean value of a function on the interval [a,b] using:

$$\bar{f} = \frac{1}{b-a} \int_a^b f(x)\,dx$$

So, after taking the integral, you can evaluate the antiderivative at the two endpoints and get the mean value analytically. In your case the antiderivative is -4.6 * exp(-0.5 * x), which gives a mean value of 1/(20-2) * (-4.6 * exp(-0.5 * 20) + 4.6 * exp(-0.5 * 2)) = 0.09400203.
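As a quick cross-check of the analytical value, base R's integrate() can compute the integral numerically, and dividing by the interval length gives the mean. This is only a sketch; the names a, b, antiderivative, mean_analytic and mean_numeric are made up for illustration:

```r
fun <- function(x) 2.3 * exp(-x / 2)
a <- 2
b <- 20

# Antiderivative of fun, worked out by hand
antiderivative <- function(x) -4.6 * exp(-x / 2)
mean_analytic <- (antiderivative(b) - antiderivative(a)) / (b - a)

# Numerical quadrature; integrate() returns a list whose $value is the integral
mean_numeric <- integrate(fun, lower = a, upper = b)$value / (b - a)

print(c(analytic = mean_analytic, numeric = mean_numeric))
```

For a smooth function like this one, the two values should agree to many digits, which is a handy sanity check on the hand-derived antiderivative.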

Now I'll focus on sampling along the interval and calculating the mean from those samples:

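get_sample_mean_from_function = function(func, interval, n = 1000) {
   # n equally spaced sample points, including both endpoints
   interval_samples = seq(interval[1], interval[2], length.out = n)
   # sapply also works when func is not vectorized
   function_values = sapply(interval_samples, func)
   return(mean(function_values))
}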

fun <- function(x) 2.3*exp(-x/2)
get_sample_mean_from_function(fun, interval = c(2,20))

By increasing the number n (number of samples taken) you can increase the precision of your answer. This is how the mean value develops with increasing sample size:

n_list = c(1,4,10,15,25,50,100,500,1000,10e3,100e3,100e4,100e5)
mean_list = sapply(n_list, 
                   function(x) get_sample_mean_from_function(fun, 
                                    interval = c(2,20), n = x))
library(ggplot2)
qplot(n_list, mean_list, geom = "point", log = "x")

[Figure: sample mean (mean_list) plotted against number of samples (n_list), logarithmic x axis]

Notice that the estimate only starts to settle once you use around 1000 samples. If we compare this numerical solution with the analytical value (stored here as real_value):

real_value = 1/(20-2) * (-4.6 * exp(-0.5 * 20) + 4.6 * exp(-0.5 * 2))
mean_list - real_value
 [1] 7.521207e-01 1.286106e-01 3.984653e-02 2.494165e-02 1.421951e-02
 [6] 6.841070e-03 3.355199e-03 6.607662e-04 3.297467e-04 3.291750e-05
[11] 3.291179e-06 3.291122e-07 3.291116e-08

We see that the error shrinks roughly in proportion to 1/n, and that even for 100e5 (ten million) samples the difference between the analytical and numerical solution is still large compared to double-precision floating point accuracy.

If you desperately need very high precision, I'd try and go for an analytical solution. However, in practice 5000 samples is more than enough to get reasonable accuracy.
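If the plain sample mean converges too slowly, one option (not part of the approach above, just a possible refinement) is to give the two endpoint samples half weight, which turns the estimate into the composite trapezoidal rule. A minimal sketch, assuming n >= 2 and a vectorized func; get_trapezoid_mean is a hypothetical helper name:

```r
fun <- function(x) 2.3 * exp(-x / 2)

# Mean via the composite trapezoidal rule on n equally spaced points (n >= 2)
get_trapezoid_mean <- function(func, interval, n = 1000) {
  x <- seq(interval[1], interval[2], length.out = n)
  y <- func(x)
  h <- (interval[2] - interval[1]) / (n - 1)   # spacing between samples
  integral <- h * (sum(y) - (y[1] + y[n]) / 2) # endpoints count half
  integral / (interval[2] - interval[1])
}

real_value <- (-4.6 * exp(-0.5 * 20) + 4.6 * exp(-0.5 * 2)) / (20 - 2)
plain_mean <- mean(fun(seq(2, 20, length.out = 1000)))
trapezoid_mean <- get_trapezoid_mean(fun, c(2, 20), n = 1000)

# Errors of both estimates relative to the analytical value
print(c(plain = plain_mean - real_value,
        trapezoid = trapezoid_mean - real_value))
```

With the same 1000 samples, the trapezoidal estimate should land noticeably closer to the analytical value than the plain mean does for this function.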
