snair.stack

Reputation: 415

Calculating area between two plots in R

I am trying to find the area between a bar plot and a curve (line plot) in R. I used the ggplot2 package for all plotting and gglocator to identify coordinates, but I am having trouble figuring out how to calculate the area between the two. The bars stay constant, but the curve changes (each row of a data frame is a different curve).

Here's a reproducible example similar to my problem:

require(ggplot2)
require(ggmap)

x1 <- seq(1, 1000, 25)
x2 <- rnorm(40, mean = 1, sd = 0.25)
df <- data.frame(x1, x2)
bardf <- data.frame(x = c(150,500,750), 
                    height = c(1.4, 1.4, 1.2), 
                    width = c(50,70,90))
p <- ggplot() + 
    geom_bar(data = bardf, aes(x,height, width = width), fill = "white", stat = "identity") +
    geom_line(data = df, aes(x1,x2))

print(p)
gglocator()

And this is the plot: [image: area between the barplot and under the curve]

To find: the area inside each bar that lies under the curve (please ignore the red circle). Does anybody have an idea how to proceed? I found a couple of questions on SO about calculating area, but most of them were for ROC curves or just about shading the region. Any suggestions or ideas will be much appreciated.

Upvotes: 1

Views: 1380

Answers (1)

alistaire

Reputation: 43334

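If you use approxfun to build a function that linearly interpolates between the points, you can use integrate to calculate the area under it across the width of each bar. If the line can rise above a bar, wrapping the interpolated value and the bar height in pmin returns the lower of the two heights: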

library(ggplot2)
set.seed(1)    # seed chosen so the line is partially higher than a bar

df <- data.frame(x1 = seq(1, 1000, 25), 
                 x2 = rnorm(40, mean = 1, sd = 0.25))
bardf <- data.frame(x = c(150,500,750), 
                    height = c(1.4, 1.4,1.2), 
                    width = c(50,70,90))

ggplot() + 
  geom_col(data = bardf, aes(x, height, width = width), fill = "white") +
  geom_line(data = df, aes(x1, x2))

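# build the linear interpolator once, then iterate in parallel over the
# columns of bardf to calculate all areas at once
curve_fun <- approxfun(df$x1, df$x2)
mapply(function(x, h, w){
    # cap the curve at the bar height and integrate across the bar's width
    integrate(function(v){pmin(curve_fun(v), h)}, 
              lower = x - .5 * w, 
              upper = x + .5 * w
    )$value}, 
    bardf$x, bardf$height, bardf$width)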
#> [1] 52.40707 83.28773 98.38771
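
Since the question mentions that the curve changes from row to row, the same calculation can be wrapped in a function and reused. The following is only a sketch under assumptions: bar_curve_area, toy_bar and curves are hypothetical names that do not appear in the question or the answer, and the curves are assumed to share one x grid. It is checked here against a toy case whose area is known by hand.

```r
# Hypothetical helper (not part of the original answer): area under the
# curve, capped at the bar height, for every bar in `bars`.
bar_curve_area <- function(curve_x, curve_y, bars) {
  # Build the linear interpolator once and reuse it for every bar
  curve_fun <- approxfun(curve_x, curve_y)
  mapply(function(x, h, w) {
    integrate(function(v) pmin(curve_fun(v), h),
              lower = x - w / 2,
              upper = x + w / 2)$value
  }, bars$x, bars$height, bars$width)
}

# Toy check: the curve y = x capped by a bar of height 1 spanning [0, 2]
# encloses a triangle (0.5) plus a rectangle (1), so the result should be
# close to 1.5 (up to the tolerance of integrate).
toy_bar <- data.frame(x = 1, height = 1, width = 2)
toy_area <- bar_curve_area(c(0, 2), c(0, 2), toy_bar)

# One curve per row of a (hypothetical) matrix sharing the same x grid
curves <- rbind(c(0, 2), c(1, 1))
areas <- apply(curves, 1, function(y) bar_curve_area(c(0, 2), y, toy_bar))
```

Note that approxfun returns NA outside the range of curve_x by default, so every bar has to lie within the x-range of the curve, otherwise integrate will stop with an error about non-finite function values.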

Upvotes: 2
