Reputation: 1162
I need to do an nls fit for crop/year data (see my earlier question, "nls fit in R - why is this such a strange fit?").
I am starting with this data and using the coefficients from a regular linear model fit as the initial parameter estimates.
x <- c(1979L, 1980L, 1981L, 1982L, 1983L, 1984L, 1985L, 1986L, 1987L,
1988L, 1989L, 1990L, 1991L, 1992L, 1993L, 1994L, 1997L, 1998L,
2000L, 2001L, 2002L, 2003L, 2004L, 2012L, 2014L, 2018L)
y <- c(94.4, 100, 120.6, 97.3, 110, 110, 80, 110, 117, 115, 50, 120,
68.4, 137, 83, 106, 124, 113, 95.8, 115.7, 60, 105, 60, 74, 95.7,
100)
mod_lm <- lm(y~x)
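For reference, the starting values being passed in are just the intercept and slope from that linear fit (a large positive intercept and a small negative slope):
coef(mod_lm)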
I am now getting this error with the minpack.lm package:
library(minpack.lm)
mod_nls <- nlsLM(y ~ a + x^b, start = list(a = mod_lm$coefficients[1], b = mod_lm$coefficients[2]))
Error in nlsModel(formula, mf, start, wts) :
singular gradient matrix at initial parameter estimates
or this one if I try the nls2 package:
library(nls2)
mod_nls <- nls2(y ~ a + x^b, start = list(a = mod_lm$coefficients[1], b = mod_lm$coefficients[2]))
Error in numericDeriv(form[[3L]], names(ind), env) :
Missing value or an infinity produced when evaluating the model
I'm assuming the problem is poor initial parameter estimates. How do I find better initial parameter estimates? Thank you.
Upvotes: 0
Views: 150
Reputation: 269441
We can use nls2 to run a brute-force grid search for an approximate fit, and we see that the result is nearly a horizontal line. With the coefficient for b shown below, x^b is approximately 0 over the range of the data, so the second term is effectively constant and can be absorbed into a. We are just as well off using the reduced model y ~ a, in which a is estimated as mean(y). The point is that any sufficiently large negative value of b gives this result, so the model is overparameterized and can be reduced to just the a term.
library(nls2)
fm <- nls2(y ~ a + x^b, start = list(a = range(y), b = c(-10, 10)), alg = "brute")
fm
## Nonlinear regression model
## model: y ~ a + x^b
## data: parent.frame()
## a b
## 99.714 -4.286
## residual sum-of-squares: 12179
##
## Number of iterations to convergence: 64
## Achieved convergence tolerance: NA
plot(y ~ x)
lines(fitted(fm) ~ x, col = "red")
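As a quick check of the reduced model (a minimal sketch, not part of the grid search above), fitting y ~ a alone recovers the sample mean:
# the least-squares estimate of a lone constant is the sample mean,
# consistent with the near-flat fit above
fm0 <- nls(y ~ a, start = list(a = 1))
coef(fm0)
mean(y)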
Upvotes: 0
Reputation: 11046
Part of the problem is that there is no strong relationship in your data:
cor(x, y)
# [1] -0.219275
cor(x, y, method="spearman")
# [1] -0.2145792
cor(x, y, method="kendall")
# [1] -0.117833
plot(y~x)
You can see from the correlations and the plot that there is only a weak negative association between x and y, and essentially no evidence that a nonlinear fit would be superior to the simple linear regression.
The plot is one line of evidence and the rank correlations are another: if the relationship were a simple monotonic curve such as a polynomial over this range, the Spearman and Kendall correlations would be noticeably larger in magnitude than the Pearson correlation, and here they are not. As @Erik Krantz points out, letting the functions guess at the initial values (1, 1) works better than using the linear regression coefficients.
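For example, a minimal sketch of that suggestion, starting both parameters at 1 instead of at the linear regression coefficients:
mod1 <- nlsLM(y ~ a + x^b, start = list(a = 1, b = 1))
coef(mod1)  # should converge where the lm-based start failed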
Here is a plot of the line described by your starting values compared to the data:
plot(x, coef(mod_lm)[1] + x^coef(mod_lm)[2], ylim=c(0, 1000), type="l")
points(x, y)
Just looking at the plot shows that the starting value for a is much too large and that a value around 100 would be better:
mod_nls <- nlsLM(y ~ a + x^b, start = list(a = 100, b = coef(mod_lm)[2]))
mod_nls
# Nonlinear regression model
# model: y ~ a + x^b
# data: parent.frame()
# a b.x
# 98.1638 -0.1306
# residual sum-of-squares: 12142
#
# Number of iterations to convergence: 13
# Achieved convergence tolerance: 1.49e-08
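As a quick visual check (illustrative, not part of the original output), the converged fit can be overlaid on the data:
plot(x, y)
lines(x, fitted(mod_nls), col = "blue")  # x is already sorted, so lines() draws in order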
Upvotes: 1