c0bra

Reputation: 1090

geom_smooth gives different fit than nls alone

Because I wanted to have the nls model as a separate object, I fit my data twice: once inside geom_smooth and once outside ggplot with nls:

library(ggplot2)
set.seed(1)
data <- data.frame(x=rnorm(100))
a <- 4
b <- -2
data$y <- with(data, exp(a + b * x) + rnorm(100) + 100)
mod <- nls(formula = y ~ (exp(a + b * x)), data = data, start = list(a = a, b = b))
data$fit <- predict(mod, newdata=data)

plot <- ggplot(data, aes(x=x, y=y)) + 
    geom_point() + 
    geom_smooth(method = "nls", colour = "red", formula = y ~ exp(a + b * x),
                method.args = list(start = c(a = a, b = b)), se = FALSE) + 
    geom_line(aes(x=x, y=fit), colour="blue") +
    scale_y_log10()


I am just wondering why the two methods give different fits even though they use the same formula and starting values. Does geom_smooth apply some transformation?

Upvotes: 3

Views: 1024

Answers (1)

aosmith

Reputation: 36084

geom_smooth doesn't make predictions at the x values of the original dataset. Instead it builds its own dataset for prediction, a sequence of evenly spaced x values across the range of the data. By default this dataset has 80 rows, but you can change this with the n argument.
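A quick way to check the row count is to build the plot and inspect the layer data. This is a minimal sketch that reuses the simulated data from the question; the object names (base, p80, p200, smooth_80, smooth_200) are only illustrative.

```r
library(ggplot2)

# Same simulated data as in the question
set.seed(1)
data <- data.frame(x = rnorm(100))
data$y <- with(data, exp(4 - 2 * x) + rnorm(100) + 100)

base <- ggplot(data, aes(x = x, y = y)) + geom_point()

# Default prediction grid
p80 <- base +
    geom_smooth(method = "nls", formula = y ~ exp(a + b * x),
                method.args = list(start = c(a = 4, b = -2)), se = FALSE)

# Finer prediction grid via the n argument
p200 <- base +
    geom_smooth(method = "nls", formula = y ~ exp(a + b * x),
                method.args = list(start = c(a = 4, b = -2)), se = FALSE,
                n = 200)

# geom_smooth is the second layer, so its data is the second list element
smooth_80 <- ggplot_build(p80)$data[[2]]
smooth_200 <- ggplot_build(p200)$data[[2]]

nrow(smooth_80)
nrow(smooth_200)
```

The x column of each of these data frames is the grid that geom_smooth predicts on, which is why a line drawn through the 100 original x values does not follow exactly the same path.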

To see that the model fit via geom_smooth and the model fit by nls are the same, you need to use the same dataset for prediction. You can pull the one used by geom_smooth out via ggplot_build. Because geom_smooth is the second layer of the plot, the dataset used for prediction is the second in the list.

dat2 = ggplot_build(plot)$data[[2]]

Now use dat2 for making predictions from the nls model and remake the plot.

dat2$fit2 = predict(mod, newdata = dat2)

ggplot(data, aes(x=x, y=y)) + 
    geom_point() + 
    geom_smooth(method = "nls", colour = "red", formula=y ~ exp(a + b * x),
              method.args = list(start = c(a = 4, b = -2)), se = FALSE) + 
    geom_line(data = dat2, aes(x=x, y=fit2), colour="blue")


Note that if you want to display on the log10 scale when comparing geom_smooth to a predicted line, you'll want to use coord_trans(y = "log10") instead of scale_y_log10. Scale transformation happens prior to model fitting, so with scale_y_log10 geom_smooth fits the model to a log10-transformed y, while the separate nls call fits it to the untransformed y.
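As a sketch of that comparison, again assuming the simulated data from the question (p_log is just an illustrative name), the plot below keeps the model fit on the original scale and only transforms the axis when drawing:

```r
library(ggplot2)

# Same simulated data as in the question
set.seed(1)
data <- data.frame(x = rnorm(100))
data$y <- with(data, exp(4 - 2 * x) + rnorm(100) + 100)

# Separate nls fit on the untransformed y
mod <- nls(y ~ exp(a + b * x), data = data, start = list(a = 4, b = -2))
data$fit <- predict(mod, newdata = data)

p_log <- ggplot(data, aes(x = x, y = y)) +
    geom_point() +
    geom_smooth(method = "nls", colour = "red", formula = y ~ exp(a + b * x),
                method.args = list(start = c(a = 4, b = -2)), se = FALSE) +
    geom_line(aes(y = fit), colour = "blue") +
    # Transform the axis at drawing time, after the model has been fit
    coord_trans(y = "log10")

p_log
```

With coord_trans the statistic sees the raw y values, so the red line from geom_smooth and the blue line from nls come from the same model and should overlap apart from the different x grids they are drawn on.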

Upvotes: 9
