Reputation: 83
Thanks to the help I got here, I was able to get a spaghetti plot of curve fits using bootstrapping. I am trying to derive confidence bands from these fitted models. I've had no luck getting something like
quants <- apply(fitted_boot, 1, quantile, c(0.025, 0.5, 0.975))
to work with the following:
library(dplyr)
library(broom)
library(ggplot2)
xdata <- c(-35.98, -34.74, -33.46, -32.04, -30.86, -29.64, -28.50, -27.29, -26.00,
-24.77, -23.57, -22.21, -21.19, -20.16, -18.77, -17.57, -16.47, -15.35,
-14.40, -13.09, -11.90, -10.47, -9.95,-8.90,-7.77,-6.80, -5.99,
-5.17, -4.21, -3.06, -2.29, -1.04)
ydata <- c(-4.425, -4.134, -5.145, -5.411, -6.711, -7.725, -8.087, -9.059, -10.657,
-11.734, NA, -12.803, -12.906, -12.460, -12.128, -11.667, -10.947, -10.294,
-9.185, -8.620, -8.025, -7.493, -6.713, -6.503, -6.316, -5.662, -5.734, -4.984,
-4.723, -4.753, -4.503, -4.200)
data <- data.frame(xdata,ydata)
x_range <- seq(min(xdata), max(xdata), length.out = 1000)
fitted_boot <- data %>%
bootstrap(100) %>%
do({
m <- nls(ydata ~ A*cos(2*pi*((xdata-x_0)/z))+M, ., start=list(A=4,M=-7,x_0=-10,z=30))
f <- predict(m, newdata = list(xdata = x_range))
data.frame(xdata = x_range, .fitted = f)
} )
ggplot(data, aes(xdata, ydata)) +
geom_line(aes(y=.fitted, group=replicate), fitted_boot, alpha=.1, color="blue") +
geom_point(size=3) +
theme_bw()
I thought perhaps geom_ribbon() would be a nice way to go, but I just don't know where to go from here.
Thank you to Axeman for helping on the other post!
Upvotes: 2
Views: 1869
Reputation: 9560
One approach would be to calculate the confidence interval at each x-value and then just plot that. Here, I am using the first value outside of the 2.5th percentile and the 97.5th percentiles, though you could adjust the code as needed.
First, I change to group_by
the xdata
locations (instead of replicates). Then, I arrange
by the .fitted
values so that I can slice
out the values I want (the first outside the percentile cutoffs). Finally, I tag them with which bound I am getting (they always go lower then upper because we sorted).
forConfInt <-
fitted_boot %>%
ungroup() %>%
group_by(xdata) %>%
arrange(.fitted) %>%
slice(c(floor(0.025 * n() )
, ceiling(0.975 * n() ) ) ) %>%
mutate(range = c("lower", "upper"))
This gives:
replicate xdata .fitted range
<int> <dbl> <dbl> <chr>
1 9 -35.98000 -4.927462 lower
2 94 -35.98000 -4.249348 upper
3 9 -35.94503 -4.927248 lower
4 94 -35.94503 -4.257776 upper
5 9 -35.91005 -4.927228 lower
6 94 -35.91005 -4.266334 upper
7 9 -35.87508 -4.927401 lower
8 94 -35.87508 -4.275020 upper
9 9 -35.84010 -4.927766 lower
10 94 -35.84010 -4.283836 upper
# ... with 1,990 more rows
And we can then just add an additional line to the ggplot
call:
ggplot(data, aes(xdata, ydata)) +
geom_line(aes(y=.fitted, group=replicate), fitted_boot, alpha=.1, color="blue") +
# Added confidence interval:
geom_line(aes(y=.fitted, group=range), forConfInt, color="red") +
geom_point(size=3) +
theme_bw()
Gives this plot:
Upvotes: 3