Scratch
Scratch

Reputation: 47

Exponential decay fit in r

I would like to fit an exponential decay function in R to the following data:

data <- structure(list(x = 0:38, y = c(0.991744340878828, 0.512512332368168, 
0.41102449265681, 0.356621905557202, 0.320851602373477, 0.29499198506227, 
0.275037747162642, 0.25938850981822, 0.245263623938863, 0.233655093612007, 
0.224041426946405, 0.214152907133301, 0.207475138903635, 0.203270738895484, 
0.194942528735632, 0.188107106969046, 0.180926819430008, 0.177028560207711, 
0.172595416846822, 0.166729221891201, 0.163502461048814, 0.159286528409165, 
0.156110097827889, 0.152655498715612, 0.148684858095915, 0.14733605355542, 
0.144691873223729, 0.143118852619617, 0.139542186417186, 0.137730138713745, 
0.134353615271572, 0.132197800438632, 0.128369567159113, 0.124971834736476, 
0.120027536018095, 0.117678812415655, 0.115720611113327, 0.112491329844252, 
0.109219168085624)), class = "data.frame", row.names = c(NA, 
-39L), .Names = c("x", "y"))

I've tried fitting with nls but the generated curve is not close to the actual data.

enter image description here

It would be very helpful if anyone could explain how to work with such nonlinear data and find a function of best fit.

Upvotes: 2

Views: 12838

Answers (3)

G. Grothendieck
G. Grothendieck

Reputation: 270438

Try y ~ .lin / (b + x^c). Note that when using "plinear" one omits the .lin linear parameter when specifying the formula to nls and also omits a starting value for it.

Also note that the .lin and b parameters are approximately 1 at the optimum so we could also try the one parameter model y ~ 1 / (1 + x^c). This is the form of a one-parameter log-logistic survival curve. The AIC for this one parameter model is worse than for the 3 parameter model (compare AIC(fm1) and AIC(fm3)) but the one parameter model might still be preferable due to its parsimony and the fact that the fit is visually indistinguishable from the 3 parameter model.

opar <- par(mfcol = 2:1, mar = c(3, 3, 3, 1), family = "mono")

# data = data.frame with x & y col names; fm = model fit; main = string shown above plot
Plot <- function(data, fm, main) {
  plot(y ~ x, data, pch = 20)
  lines(fitted(fm) ~ x, data, col = "red")
  legend("topright", bty = "n", cex = 0.7, legend = capture.output(fm))
  title(main = paste(main, "- AIC:", round(AIC(fm), 2)))
}  

# 3 parameter model
fo3 <- y ~ 1/(b + x^c) # omit .lin parameter; plinear will add it automatically
fm3 <- nls(fo3, data = data, start = list(b = 1, c = 1), alg = "plinear")
Plot(data, fm3, "3 parameters")

# one parameter model
fo1 <- y ~ 1 / (1 + x^c)
fm1 <- nls(fo1, data, start = list(c = 1))
Plot(data, fm1, "1 parameter")

par(read.only = opar)

screenshot

AIC

Adding the solutions in the other answers we can compare the AIC values. We have labelled each solution by the number of parameters it uses (the degrees of freedom would be one greater than that) and have reworked the log-log solution to use nls instead of lm and have a LHS of y since one cannot compare the AIC values of models having different left hand sides or using different optimization routines since the log likelihood constants used could differ.

fo2 <- y ~ exp(a + b * log(x+1))
fm2 <- nls(fo2, data, start = list(a = 1, b = 1))

fo4 <- y ~ SSbiexp(x, A1, lrc1, A2, lrc2)
fm4 <- nls(fo4, data)

aic <- AIC(fm1, fm2, fm3, fm4)
aic[order(aic$AIC), ]

giving from best AIC (i.e. fm3) to worst AIC (i.e. fm2):

    df     AIC
fm3  4 -329.35
fm1  2 -307.69
fm4  5 -215.96
fm2  3 -167.33

Upvotes: 4

Roland
Roland

Reputation: 132999

A biexponential model would fit much better, though still not perfect. This would indicate that you might have two simultaneous decay processes.

fit <- nls(y ~ SSbiexp(x, A1, lrc1, A2, lrc2), data = data)
#A1*exp(-exp(lrc1)*x)+A2*exp(-exp(lrc2)*x)

plot(y ~x, data = data)
curve(predict(fit, newdata = data.frame(x)), add = TRUE)

resulting plot

If the measurement error depends on magnitude, you could consider using it for weighting.

However, you should consider carefully what kind of model you'd expect from your domain knowledge. Just selecting a non-linear model empirically is usually not a good idea. A non-parametric fit might be a better option.

Upvotes: 1

Adam Warner
Adam Warner

Reputation: 1354

data <- structure(list(x = 0:38, y = c(0.991744340878828, 0.512512332368168, 
0.41102449265681, 0.356621905557202, 0.320851602373477, 0.29499198506227, 
0.275037747162642, 0.25938850981822, 0.245263623938863, 0.233655093612007, 
0.224041426946405, 0.214152907133301, 0.207475138903635, 0.203270738895484, 
0.194942528735632, 0.188107106969046, 0.180926819430008, 0.177028560207711, 
0.172595416846822, 0.166729221891201, 0.163502461048814, 0.159286528409165, 
0.156110097827889, 0.152655498715612, 0.148684858095915, 0.14733605355542, 
0.144691873223729, 0.143118852619617, 0.139542186417186, 0.137730138713745, 
0.134353615271572, 0.132197800438632, 0.128369567159113, 0.124971834736476, 
0.120027536018095, 0.117678812415655, 0.115720611113327, 0.112491329844252, 
0.109219168085624)), class = "data.frame", row.names = c(NA, 
-39L), .Names = c("x", "y"))

# Do this because the log of 0 is not possible to calculate
data$x = data$x +1

fit = lm(log(y) ~ log(x), data = data)
plot(data$x, data$y)
lines(data$x, data$x ^ fit$coefficients[2], col = "red")

This did a lot better than using the nls forumla. And when plotting the fit seems to do fairly well.

Upvotes: 0

Related Questions