Nakx
Nakx

Reputation: 1600

Speed up iterations of piecewise regressions with the segmented function in R

I am trying to fit about 300 piecewise regressions with the segmented function from the segmented package in R. This is taking a lot of time (~4days) because of the segmented function. I am already using all the cores of my computer, but I am not a programmer and I guess this code is probably not optimal. Can I improve the code below to make it run faster? How?

Here is a reproducible example. df is a simulated data frame that corresponds to one of the 300 datasets that I want to analyze. Each dataset is one day, and during each day I measure the temperature every 5 minutes, x is the temperature and y the time of the day. The figure below shows what my data look like. The pattern is very specific and repeatable across days and each change in slope corresponds to well understood biological mechanisms. This is why I can guess all the values of psi (for ex. time of sunrise and sunset).

Of course the real data are more variable and I use many iterations (about 200, here I reduced to 10 for the example) to increase my chances of getting a successful fit.

library(segmented)
y<-seq(1,288,1)
x<-c(seq(0,-30,-1),seq(-30,-54,-2),seq(-54,30,1),seq(30,10,-1),seq(10,90,1),seq(90,34,-1))
df<-data.frame(x,y)
head(df)

plot(x~y)
t1=31
t2=44
t3=129
t4=150
t5=231

iterations<-10
for (j in 1:iterations) {           
  res <- lm(formula=x~y,data=df)      
  try(result <- segmented(
       res, seg.Z=~y, psi=c(t1,t2,t3,t4,t5),
       control=seg.control(it.max=200, display=F, K=4, h=0.1, n.boot=100, random=T)))          
}

result

Temperature ~ time of the day

Taking the lm out of the loop doesn't significantly improve the speed of the loop.

Upvotes: 1

Views: 326

Answers (1)

kuppern87
kuppern87

Reputation: 1135

One thing that should help is to break out of the iterations once the result is found. In most cases it should find something on the first iteration and this will avoid running 200 unnecessary iterations.

rm(result)
for (j in 1:iterations) {           
  res <- lm(formula=x~y,data=df)
  try(result <- segmented(
       res, seg.Z=~y, psi=c(t1,t2,t3,t4,t5),
       control=seg.control(it.max=200, display=F, K=4, h=0.1, n.boot=100, random=T)))
  if (exists("result")) break
}

Upvotes: 2

Related Questions