faith
faith

Reputation: 25

plot separate curves for non-linear (nls) model with categorical variable and separate parameter values

I have a dataset of age and length, plus some categorical variables including sex and location (2 level factor). I have fit a Gompertz model to this, using nls():

gompertz <- nls(Length~a*exp(-b*exp(-c*age)), 
                     data=df, 
                     start=list(a=155,b=0.4, c=0.1)) 

but have been struggling to include a categorical variable as well - i.e. I want to compare the growth rates between 2 locations. A solution has been suggested here:

https://stats.stackexchange.com/questions/63226/non-linear-modelling-with-several-variables-including-a-categorical-variable?newreg=b3e4387021fe4108ad74075bee2e370a

and this model works - I get separate a, b and c parameter estimates for both levels of my location factor. However I do not know how to go about plotting this - I want to plot my raw data (length by age) which 2 separate prediction curves, 1 for each value of location with the separate parameter estimates.

my model is as follows:

gompertz.locs <- nls(formula = Length ~ 
                 as.numeric(location==1)*a1*exp(-b1*exp(-c1*age)) 
               + as.numeric(location==2)*a2*exp(-b2*exp(-c2*age)),
               data = df, 
               start = list(a1=150,b1=0.5, c1=0.5,
                            a2=150,b2=0.5, c2=0.5))

sample data can be generated using:

ages<- runif(100, 0, 22) #ages 0-22

#parameters for model
a1<-153
b1<-0.51
c1<-0.53

a2<-147
b2<-0.45
c2<-0.43

#generate length with error normally distributed 
length1 <- (a1*exp(-b1*exp(-c1*ages))) +rnorm(100, mean=0, sd=5)
length2 <- (a2*exp(-b2*exp(-c2*ages))) +rnorm(100, mean=0, sd=5)

df<-data.frame(Length=c(length1, length2), age=rep(ages, 2), location=c(rep(1,100), rep(2,100)))

TIA!

Upvotes: 0

Views: 481

Answers (2)

DaveArmstrong
DaveArmstrong

Reputation: 21992

How about this:

library(ggplot2)
ages<- runif(100, 0, 22) #ages 0-22

#parameters for model
a1<-153
b1<-0.51
c1<-0.53

a2<-147
b2<-0.45
c2<-0.43

#generate length with error normally distributed 
length1 <- (a1*exp(-b1*exp(-c1*ages))) +rnorm(100, mean=0, sd=5)
length2 <- (a2*exp(-b2*exp(-c2*ages))) +rnorm(100, mean=0, sd=5)

df<-data.frame(Length=c(length1, length2), age=rep(ages, 2), location=c(rep(1,100), rep(2,100)))


gompertz.locs <- nls(formula = Length ~ 
                       as.numeric(location==1)*a1*exp(-b1*exp(-c1*age)) 
                     + as.numeric(location==2)*a2*exp(-b2*exp(-c2*age)),
                     data = df, 
                     start = list(a1=150,b1=0.5, c1=0.5,
                                  a2=150,b2=0.5, c2=0.5))

a <- coef(gompertz.locs)[c(1,4)]
b <- coef(gompertz.locs)[c(2,5)]
c <- coef(gompertz.locs)[c(3,6)]
df$fit <- a[df$location]*exp(-b[df$location]*exp(-c[df$location]*df$age))

ggplot(df, aes(x=age, y=Length, colour=as.factor(location))) + 
  geom_point() + 
  geom_line(aes(y=fit)) + 
  theme_classic() + 
  labs(colour="Location")

Created on 2022-07-06 by the reprex package (v2.0.1)

Upvotes: 1

G. Grothendieck
G. Grothendieck

Reputation: 269965

First run nls. We can use subscripts to get separate parameter values for each location. Then augment df with the fitted values and convert location to factor so that we can run ggplot2.

st <- list(a = c(155, 155), b = c(0.4, 0.4), c = c(0.1, 0.1))
fo <- Length ~ a[location] * exp(-b[location] * exp(-c[location] * age))
fm <- nls(fo, df, start = st)

library(ggplot2)

df2 <- transform(df, fitted = fitted(fm), location = factor(location))
ggplot(df2, aes(age, Length, col = location)) + 
  geom_point() + 
  geom_line(aes(y = fitted))
  

screenshot

Note

The input in the question is not reproducible due to the use of random numbers without set.seed so we used this.

set.seed(123)
ages<- runif(100, 0, 22) #ages 0-22

#parameters for model
a1<-153
b1<-0.51
c1<-0.53

a2<-147
b2<-0.45
c2<-0.43

#generate length with error normally distributed 
length1 <- (a1*exp(-b1*exp(-c1*ages))) +rnorm(100, mean=0, sd=5)
length2 <- (a2*exp(-b2*exp(-c2*ages))) +rnorm(100, mean=0, sd=5)

df <- data.frame(Length=c(length1, length2), age=rep(ages, 2), 
  location=c(rep(1,100), rep(2,100)))

Upvotes: 4

Related Questions