Reputation: 2583
I would like to fit a rational function to my data:
data:
[1] 2.000000 3.000000 2.333333 1.750000 2.000000 1.833333 2.416667 1.916667
[9] 1.750000 2.166667 2.116667 1.916667 1.944444 1.611111 1.722222 1.777778
[17] 1.877778 1.944444 1.958333 1.833333 2.041667 2.020833 1.908333 1.916667
[25] 1.733333 1.833333 1.800000 1.933333 1.893333 1.866667 1.888889 1.805556
[33] 1.833333 1.847222 1.822222 1.805556 1.833333 1.904762 1.880952 1.833333
[41] 1.804762 1.809524 1.708333 1.708333 1.750000 1.708333 1.683333 1.687500
[49] 1.611111 1.666667 1.648148 1.611111 1.611111 1.611111 1.650000 1.600000
[57] 1.650000 1.625000 1.630000 1.616667 1.469697 1.560606 1.590909 1.651515
[65] 1.651515 1.651515 1.513889 1.555556 1.625000 1.638889 1.647222 1.652778
[73] 1.679487 1.717949 1.705128 1.698718.
The model I would like to fit is the following:
Model <- function(t, a, b, c, d) { (a + b*t)/(1 + c*t + d*t^2) }
I know that I firstly have to give a starting list of a, b, c... for nls but I really don't know how to set the parameters. Since I'm not an expert I found in this http://www.css.cornell.edu/faculty/dgr2/teach/R/R_rat.pdf document a useful guide. But at some point it says:
"Given a set of ordered pairs (ti,yi) where in general there are repeated measurements at each value of t, the parameters of a rational function can be fitted by non-linear least-squares estimation, for example with the nls method in R. One we have the four parameters, we can compute the value of t at which this is maximized, by computing the first derivative....".
Although I don't report additional data I have another column that represents time (integers from 1:76 representing years).
Can anyone help me?
Best
E.
Upvotes: 0
Views: 845
Reputation: 270120
The model is not fully specified in the question but assuming the model in the code below and the data shown reproducibly in Note 2 below if we set c = d = 0 then it is a linear model so we can use the coefficients from a linear model fit as starting values:
fm1 <- lm(y ~ t)
st2 <- list(a = coef(fm1)[[1]], b = coef(fm1)[[2]], c = 0, d = 0)
fm2 <- nls(y ~ Model(t, a, b, c, d), start = st2)
giving:
> fm2
Nonlinear regression model
model: y ~ Model(t, a, b, c, d)
data: parent.frame()
a b c d
2.5097712 0.6038808 0.3205409 0.0008663
residual sum-of-squares: 1.684
Number of iterations to convergence: 16
Achieved convergence tolerance: 8.029e-06
Looking at the fit graphically:
# model is shown in red. See Note 1 for fm4 (blue) and fm5 (green) models.
plot(y ~ t)
lines(fitted(fm2) ~ t, col = "red")
lines(fitted(fm4) ~ t, col = "blue")
lines(fitted(fm5) ~ t, col = "green")
legend("topright", c("fm2", "fm4", "fm5"), col = c("red", "blue", "green"), lty = 1)
The following is a different model which fits almost as well but only uses 3 parameters. See blue line on graph above.
fm3 <- lm(log(y) ~ log(t))
st4 <- list(a = coef(fm3)[[1]], b = 0, c = coef(fm3)[[2]])
fm4 <- nls(y ~ exp(a + b/t + c*log(t)), start = st4)
> fm4
Nonlinear regression model
model: y ~ exp(a + b/t + c * log(t))
data: parent.frame()
a b c
0.9845 -0.1767 -0.1157
residual sum-of-squares: 1.685
Number of iterations to convergence: 4
Achieved convergence tolerance: 2.625e-06
Also this model isn't too bad. It uses only two parameters, it is linear in them and it has a residual sum of squares of 1.728837 compared to 1.684 for the fm2 model and 1.685 for the fm4 model. See green line on graph above.
fm5 <- lm(y ~ log(t))
> fm5
Call:
lm(formula = y ~ log(t))
Coefficients:
(Intercept) log(t)
2.4029 -0.1793
> deviance(fm5)
[1] 1.728837
y <- c(2, 3, 2.333333, 1.75, 2, 1.833333, 2.416667, 1.916667, 1.75,
2.166667, 2.116667, 1.916667, 1.944444, 1.611111, 1.722222, 1.777778,
1.877778, 1.944444, 1.958333, 1.833333, 2.041667, 2.020833, 1.908333,
1.916667, 1.733333, 1.833333, 1.8, 1.933333, 1.893333, 1.866667,
1.888889, 1.805556, 1.833333, 1.847222, 1.822222, 1.805556, 1.833333,
1.904762, 1.880952, 1.833333, 1.804762, 1.809524, 1.708333, 1.708333,
1.75, 1.708333, 1.683333, 1.6875, 1.611111, 1.666667, 1.648148,
1.611111, 1.611111, 1.611111, 1.65, 1.6, 1.65, 1.625, 1.63, 1.616667,
1.469697, 1.560606, 1.590909, 1.651515, 1.651515, 1.651515, 1.513889,
1.555556, 1.625, 1.638889, 1.647222, 1.652778, 1.679487, 1.717949,
1.705128, 1.698718)
t <- seq_along(y)
Upvotes: 3
Reputation: 2188
I believe you can do something like
nls(y ~ (a1 + b1*times) / (1 + c1*times + d1*times^2))
where y
is the data you gave above and times=1:76
. I added a 1
to the parameters, because nls
did not recognize c
as a parameter, but rather as the function c()
.
However, when I ran this, I get an error of singular gradient
and suggests initialize the starting values to something other than 1
(the default). You can specify the starting values with the argument start = list("a1"=0.1, "b1"=0.1, "c1"=0.1, "d1"=0.1)
, but this didn't seem to help. Perhaps you have a better idea of what the starting values should be?
Upvotes: 0