tfut22312

Reputation: 21

How can I estimate a series of linear segments to fit an exponential curve?

This might be more of a math question, but ultimately I'd like to do this in R. Given a basic exponential curve, I'd like to understand how to use R to fit a series of linear functions to it as closely as I can. The reason is that each line represents a particular relationship, namely a rate of change, and at each inflection point the rate of change increases. These inflection points are important for the user to know. I have attached a crude drawing of what I am trying to accomplish.

Exponential Curve with Linear Lines

The black line is the exponential curve, the red lines are the series of linear segments, and the orange circles mark where the segments intersect. I can do this in a haphazard way by picking arbitrary data points and building linear models until I find a combination that I feel best fits the exponential curve, but I know there is a better way than that.

Here is some code that might help:

data <- 1:34
sales <- c(
   20000000,  25000000,  30000000,  35000000,  43000000,
   50000000,  57000000,  65000000,  72000000,  80000000,  89000000,
   97000000, 108000000, 118000000, 128000000, 138000000, 150000000,
  161000000, 174000000, 187000000, 203000000, 218000000, 235000000,
  251000000, 260000000, 280000000, 293000000, 310000000, 333000000,
  363000000, 390000000, 415000000, 454000000, 540000000
)
data2 <- data.frame(data, sales)

plot(data2$data,data2$sales)

plot of exponential curve as data

Upvotes: 2

Views: 720

Answers (2)

Ben Bolker

Reputation: 226067

With the segmented package (see this question):

library(segmented)
m1 <- lm(sales ~ data, data = data2)  ## initial fit
s1 <- segmented(m1)     ## one breakpoint
s2 <- segmented(m1, psi = c(10,25))  ## two breakpoints, estimated starting values
plot(sales ~ data, data = data2)
lines(data2$data, predict(s1))
lines(data2$data, predict(s2), col = 2, lwd = 2)

Results:

s2
Call: segmented.lm(obj = m1, psi = c(10, 25))

Meaningful coefficients of the linear terms:
(Intercept)         data      U1.data      U2.data  
    5942857      7732143      7105220     26962637  

Estimated Break-Point(s):
psi1.data  psi2.data  
    15.72      29.65  
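To read these numbers as rates of change: in segmented's parameterisation the `U1.data` and `U2.data` terms are changes in slope at each break-point, not slopes themselves. A minimal sketch, using only the coefficients printed above (the names `coefs` and `segment_slopes` are just illustrative), recovers the slope of each of the three segments:

```r
# Coefficients copied by hand from the printed s2 output above
coefs <- c(data = 7732143, U1.data = 7105220, U2.data = 26962637)

# U1.data and U2.data are slope *differences* at the break-points,
# so each segment's slope is the running sum of the terms so far
segment_slopes <- cumsum(coefs)
names(segment_slopes) <- c("segment1", "segment2", "segment3")
segment_slopes
```

The segmented package also has a `slope()` function that should report these directly from the fitted object, e.g. `slope(s2)`.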

Unlike @JJacquelin's provided solution, you do need to provide starting values for the breakpoints when estimating >1 breakpoint, but they only need to be something reasonable — especially for simple/well-behaved data, the results will be (nearly) identical for a range of similar starting value choices.
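If you would rather not eyeball the starting values, one option is a coarse brute-force search in base R. This is only a sketch and is not what segmented does internally: it assumes exactly two break-points restricted to integer positions, and the names `rss_for`, `grid` and `best_psi` are made up for illustration.

```r
# Question's data, repeated so this block runs on its own
data  <- 1:34
sales <- c(
   20000000,  25000000,  30000000,  35000000,  43000000,
   50000000,  57000000,  65000000,  72000000,  80000000,  89000000,
   97000000, 108000000, 118000000, 128000000, 138000000, 150000000,
  161000000, 174000000, 187000000, 203000000, 218000000, 235000000,
  251000000, 260000000, 280000000, 293000000, 310000000, 333000000,
  363000000, 390000000, 415000000, 454000000, 540000000
)

# Residual sum of squares of a continuous piecewise-linear fit with
# fixed break-points k1 < k2; pmax(data - k, 0) is a hinge term that
# lets the slope change at k while keeping the line continuous
rss_for <- function(k1, k2) {
  fit <- lm(sales ~ data + pmax(data - k1, 0) + pmax(data - k2, 0))
  sum(resid(fit)^2)
}

# Candidate break-points: interior integers, at least 3 apart
grid <- expand.grid(k1 = 3:31, k2 = 3:31)
grid <- grid[grid$k2 - grid$k1 >= 3, ]
grid$rss <- mapply(rss_for, grid$k1, grid$k2)

# Pair with the smallest residual sum of squares
best <- grid[which.min(grid$rss), ]
best_psi <- c(best$k1, best$k2)
best_psi
```

The result can then be passed on as starting values, e.g. `segmented(m1, psi = best_psi)`, and segmented will refine the break-points to non-integer positions.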

data with predictions from segmented fits

Mathematically, I would be picky and say that an exponential curve doesn't really have an inflection point — the slope continuously and gradually increases — but if this is a useful way to convey something to an audience, go for it.
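To spell that out, assuming the curve has the form $y = a e^{bx}$ with $a > 0$ and $b > 0$ (the symbols $a$ and $b$ are introduced here only for illustration):

```latex
y = a e^{bx}, \qquad
y' = a b \, e^{bx} > 0, \qquad
y'' = a b^{2} e^{bx} > 0 \quad \text{for all } x .
```

An inflection point requires $y''$ to change sign, which never happens here, so the break-points from a piecewise fit are a property of the approximation rather than of the curve.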

Upvotes: 3

JJacquelin

Reputation: 1705

This is a problem of fitting a piecewise function made of three linear segments.

A very simple method (not iterative, no initial guess required) is explained in https://fr.scribd.com/document/380941024/Regression-par-morceaux-Piecewise-Regression-pdf . The relevant case is treated on pp. 20-22.

A numerical example is given below. The next figure shows the result:

x = 0.17 0.23 0.293 0.349 0.401 0.457 0.509 0.563 0.619 0.668 0.713 0.756 0.798 0.832 0.864 0.889 0.912 0.935 0.957 0.977

y = 0.09 0.094 0.09 0.067 0.082 0.114 0.141 0.173 0.212 0.247 0.278 0.325 0.408 0.459 0.518 0.584 0.631 0.698 0.78 0.859

[figure: plot of the example data with the fitted piecewise function]

To make the code easier to implement and check, the calculation is shown below in full detail:

[figure: the calculation, step by step]

The fitting criterion is the least mean square error over the whole data set at once (not segment by segment).

NOTE: The example above was deliberately chosen with few points (20) to make checking easier. The drawback is that so few points, relative to the number of parameters to optimise (5), brings a risk of failure or deviation. The method is based on numerical integration, which needs as many points as possible for good accuracy.

Upvotes: 2
