Reputation: 415
I am trying to find the area between a bar plot and a curve/normal plot in R. I used the ggplot2 package for all the plotting and gglocator to identify the coordinates, but I can't figure out how to calculate the area between the bars and the curve. The bar plot will remain constant, but the curve will change (each curve is one row of a df).
Here's a reproducible example similar to my problem:
require(ggplot2)
require(ggmap)
x1 <- seq(1, 1000, 25)
x2 <- rnorm(40, mean = 1, sd = 0.25)
df <- data.frame(x1, x2)
bardf <- data.frame(x = c(150, 500, 750),
                    height = c(1.4, 1.4, 1.2),
                    width = c(50, 70, 90))
p <- ggplot() +
  geom_bar(data = bardf, aes(x, height, width = width), fill = "white", stat = "identity") +
  geom_line(data = df, aes(x1, x2))
print(p)
gglocator()
To find: the area inside the bars and under the curve (please ignore the red circle). Does anybody have an idea how to proceed? I found a couple of questions on SO about calculating areas, but most of them were about ROC curves or just about shading the region. Any suggestions or ideas will be much appreciated.
Upvotes: 1
Views: 1380
Reputation: 43334
If you use approxfun to build a function that interpolates between the points, you can use integrate to calculate the area. If the bar can be lower than the line, pmin can return the lower of the two heights:
library(ggplot2)
set.seed(1) # returns a line partially higher than a bar
df <- data.frame(x1 = seq(1, 1000, 25),
                 x2 = rnorm(40, mean = 1, sd = 0.25))
bardf <- data.frame(x = c(150, 500, 750),
                    height = c(1.4, 1.4, 1.2),
                    width = c(50, 70, 90))
ggplot() +
  geom_col(data = bardf, aes(x, height, width = width), fill = "white") +
  geom_line(data = df, aes(x1, x2))
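# build the linear interpolator once rather than on every integrand call
line_fun <- approxfun(df$x1, df$x2)
# iterate in parallel over bardf to calculate all areas at once
mapply(function(x, h, w) {
  # integrate the lower of the line and the bar top across the bar's width
  integrate(function(v) pmin(line_fun(v), h),
            lower = x - .5 * w,
            upper = x + .5 * w)$value
}, bardf$x, bardf$height, bardf$width)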
#> [1] 52.40707 83.28773 98.38771
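Since the question says the bars stay fixed while the curve changes with each row of a df, the same approxfun/integrate/pmin idea can be wrapped in a function and applied row by row. What follows is only a sketch under assumptions: bar_curve_area is a hypothetical helper name, and the curves matrix (one curve per row, one column per x position) is an assumed layout, not something taken from the question's data.
```r
# Hypothetical helper (not part of the original answer): area under
# min(curve, bar height) across each bar's horizontal extent.
bar_curve_area <- function(x, y, bars) {
  curve_fun <- approxfun(x, y)  # linear interpolation between the points
  mapply(function(centre, h, w) {
    integrate(function(v) pmin(curve_fun(v), h),
              lower = centre - w / 2,
              upper = centre + w / 2)$value
  }, bars$x, bars$height, bars$width)
}

bars <- data.frame(x = c(150, 500, 750),
                   height = c(1.4, 1.4, 1.2),
                   width = c(50, 70, 90))
x1 <- seq(1, 1000, 25)

# Assumed layout: one curve per row, one column per x position
set.seed(1)
curves <- matrix(rnorm(3 * length(x1), mean = 1, sd = 0.25), nrow = 3)

# Result has one row per bar and one column per curve
areas <- apply(curves, 1, function(y) bar_curve_area(x1, y, bars))
```
A quick sanity check is to pass a flat curve: with y fixed at 1, which is below every bar top, each area should simply equal the bar's width. Note that approxfun returns NA outside the range of x, so every bar has to lie within the x range of the curve.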
Upvotes: 2