Reputation: 1090
Because I wanted to have the nls model available separately, I fit my data twice: once inside geom_smooth and once outside ggplot with nls:
library(ggplot2)
set.seed(1)
data <- data.frame(x=rnorm(100))
a <- 4
b <- -2
data$y <- with(data, exp(a + b * x) + rnorm(100) + 100)
mod <- nls(formula = y ~ (exp(a + b * x)), data = data, start = list(a = a, b = b))
data$fit <- predict(mod, newdata=data)
plot <- ggplot(data, aes(x = x, y = y)) +
  geom_point() +
  geom_smooth(method = "nls", colour = "red", formula = y ~ exp(a + b * x),
              method.args = list(start = c(a = a, b = b)), se = FALSE, span = 0) +
  geom_line(aes(x = x, y = fit), colour = "blue") +
  scale_y_log10()
I'm wondering why the two methods, despite using the same formula and starting values, give different fits. Does geom_smooth apply some transformation?
Upvotes: 3
Views: 1024
Reputation: 36084
geom_smooth doesn't make predictions from the original dataset, but instead makes a new dataset for prediction. By default this dataset has 80 rows, but you can change this with the n argument.
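As a rough illustration of that idea in base R, the sketch below refits the model from the question and predicts on an 80-point grid instead of on the original x values. The name grid is made up for this example, and spacing the grid evenly over the range of x is my understanding of the default behaviour rather than something stated above.

```r
# Same simulated data and model as in the question
set.seed(1)
data <- data.frame(x = rnorm(100))
data$y <- with(data, exp(4 - 2 * x) + rnorm(100) + 100)
mod <- nls(y ~ exp(a + b * x), data = data, start = list(a = 4, b = -2))

# Hypothetical stand-in for the prediction dataset: 80 evenly spaced
# x values spanning the observed range (assumed default, n = 80)
grid <- data.frame(x = seq(min(data$x), max(data$x), length.out = 80))
grid$fit <- predict(mod, newdata = grid)

# The model is the same one; only the x values it is evaluated at differ
head(grid)
```

The curve drawn from grid and the curve drawn from the original 100 points come from the same coefficients, so any visible difference between them comes from where the curve is evaluated, not from the fit itself.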
To see that the model fit via geom_smooth and the model fit by nls are the same, you need to use the same dataset for prediction. You can pull the one used by geom_smooth out via ggplot_build. The dataset used for prediction is the second in the list, because geom_smooth is the second layer of the plot.
dat2 = ggplot_build(plot)$data[[2]]
Now use dat2 for making predictions from the nls model and remake the plot.
dat2$fit2 = predict(mod, newdata = dat2)
ggplot(data, aes(x = x, y = y)) +
  geom_point() +
  geom_smooth(method = "nls", colour = "red", formula = y ~ exp(a + b * x),
              method.args = list(start = c(a = 4, b = -2)), se = FALSE) +
  geom_line(data = dat2, aes(x = x, y = fit2), colour = "blue")
Note that if you want to display on the log10 scale when comparing geom_smooth to a predicted line, you'll want to use coord_trans(y = "log10") instead of scale_y_log10. Scale transformation happens prior to model fitting, so you would be fitting a model to a log10-transformed y if you use scale_y_log10.
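A quick numerical check of this, as a sketch rather than a definitive test: build the plot with coord_trans(y = "log10"), pull the smooth layer out with ggplot_build, and compare its fitted values with predictions from the standalone nls model on the same x values. The names p, smooth and max_diff are made up for this example, and I am assuming the built layer data stays on the original y scale because the coordinate transformation is only applied when the plot is drawn.

```r
library(ggplot2)

# Same simulated data and reference fit as in the question
set.seed(1)
data <- data.frame(x = rnorm(100))
data$y <- with(data, exp(4 - 2 * x) + rnorm(100) + 100)
mod <- nls(y ~ exp(a + b * x), data = data, start = list(a = 4, b = -2))

p <- ggplot(data, aes(x = x, y = y)) +
  geom_point() +
  geom_smooth(method = "nls", colour = "red", formula = y ~ exp(a + b * x),
              method.args = list(start = c(a = 4, b = -2)), se = FALSE) +
  coord_trans(y = "log10")  # changes the display only; the model sees raw y

# Layer 2 is geom_smooth: x holds the prediction grid, y the fitted curve
smooth <- ggplot_build(p)$data[[2]]

# Largest gap between geom_smooth's curve and the standalone nls model
max_diff <- max(abs(smooth$y - predict(mod, newdata = smooth)))
max_diff
```

If max_diff is negligible relative to the size of y, the two fits are the same model. Swapping coord_trans(y = "log10") for scale_y_log10() in this sketch would instead hand log10(y) to nls, which is the source of the discrepancy in the question.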
Upvotes: 9