Reputation: 45
I am trying to fit a non-linear model using nls
in R, however, I can't seem to get the model specification correct. Any advice on how to correct the model, would be much appreciated. A minimal example below.
# Import dummy data
data <- read.csv("https://raw.githubusercontent.com/guysutton/nls_singular_repo/main/data2.csv")
head(data)
I think the specification of the fixed effects is what is causing the issue. I am not experienced with fitting 'nls' and picking starting values, as several other StackOverflow posts have indicated that the error I am getting could be because of poor starting values. I have tried starting pH(1:3, in increments of 0.1).
# Fit NLS model
mod <- stats::nls(
# Response variable
y ~
# Predictors
( (x1 + x2) / (-0.6258*pH^4+11.847*pH^3-79.884*pH^2+230.97*(pH)-242.64) ),
start = list(pH = 1),
control = nls.control(maxiter = 1000),
data = data
)
Error in nls(y ~ ((x1 + x2)/(-0.6258 * pH^4 + 11.847 * pH^3 - 79.884 * : singular gradient
However, when I take the fixed effect expression and manually input values for the first row of data, I get the correct value.
# This produces the correct value for the first row of data when I sub in the values
(x1 + x2) / (-0.6258*pH^4+11.847*pH^3-79.884*pH^2+230.97*(pH)-242.64)
(1.50 + 0.45) / (-0.6258*4.41^4+11.847*4.41^3-79.884*4.41^2+230.97*(4.41)-242.64)
[1] 1.132759
Any advice on what might be causing my issue, and how to potentially solve this issue, would be much appreciated.
Upvotes: 0
Views: 80
Reputation: 8110
The problem (as pointed out in the comments) is that you are trying to determine the coefficients for your model with nls
. The way you currently have the model set up, you are trying to predict a single pH value that describes all the variability in y given x1 and x2. I think what you really want to to determine the coefficients for your input pH values that give the best model. Maybe something like:
library(tidyverse)
mod <- stats::nls(
# Response variable
y ~
# Predictors
( (x1 + x2) /(k*pH + k2*pH^2 + k3*pH^3 + k4*pH^4) ),
start = list(k = 10, k2 = 1, k3 = 0.3, k4 = 0.03),
control = nls.control(maxiter = 1000, tol = 1e-3, minFactor = 1e-5),
data = data
)
summary(mod)
#>
#> Formula: y ~ ((x1 + x2)/(k * pH + k2 * pH^2 + k3 * pH^3 + k4 * pH^4))
#>
#> Parameters:
#> Estimate Std. Error t value Pr(>|t|)
#> k -113.65175 4.31320 -26.35 <2e-16 ***
#> k2 82.57272 3.20163 25.79 <2e-16 ***
#> k3 -19.84041 0.78430 -25.30 <2e-16 ***
#> k4 1.58837 0.06349 25.02 <2e-16 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 0.6587 on 5076 degrees of freedom
#>
#> Number of iterations to convergence: 13
#> Achieved convergence tolerance: 0.0007377
data|>
mutate(mod = predict(mod)) |>
ggplot(aes(y, mod))+
geom_point()+
scale_x_log10()+
scale_y_log10()
Upvotes: 1