Lara

Reputation: 73

How to find an appropriate model to fit the nonlinear curve of my data in R

I have a dataset, Reg:

dist     ED
75  4.9
150 7.6
225 8.9
300 8.8
375 8.1
450 7.3
525 6.5
600 5.8

I want to find a well-fitting nonlinear regression model. I've tried:

plot(Reg$ED, Reg$dist)
lines(lowess(Reg$ED,Reg$dist))
m1 <- lm(ED ~poly(dist,2,raw=TRUE),data=Reg)
m2 <- lm(ED~dec+I(dist^2),data=Reg)
summary(m1)
summary(m2)
lines(Reg$PVFD_Mean, predict(m2), col=2)

But I don't know why the lines of the regression model don't show up in the plot, so I couldn't figure out how to find the best-fitting model for my data. I also tried fitModel, but that didn't work either.

Any help is much appreciated.

Thanks a lot

Upvotes: 2

Views: 1301

Answers (3)

James Phillips

Reputation: 4647

My equation search shows a good fit to a three-parameter inverse Harris yield density equation, "y = x / (a + b * pow(x, c))", with parameters a = 1.4956613575678071E+01, b = 7.8559465184281589E-05, and c = 2.1768293119284090E+00, giving RMSE = 0.1002 and R-squared = 0.9943.
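As a rough check in R, the sketch below evaluates that equation at the quoted parameter values on the data from the question and recomputes the RMSE and R-squared. The helper name harris_inverse and the variable c_exp are illustrative choices, not part of the original answer, and RMSE is assumed to mean the square root of the mean squared residual.

```r
# Data from the question
Reg <- data.frame(
  dist = c(75, 150, 225, 300, 375, 450, 525, 600),
  ED   = c(4.9, 7.6, 8.9, 8.8, 8.1, 7.3, 6.5, 5.8)
)

# Parameter values quoted in the answer
a     <- 1.4956613575678071E+01
b     <- 7.8559465184281589E-05
c_exp <- 2.1768293119284090E+00  # named c_exp to avoid masking base::c

# Hypothetical helper: y = x / (a + b * x^c)
harris_inverse <- function(x) x / (a + b * x^c_exp)

# Residuals and goodness-of-fit measures
resid_ED  <- Reg$ED - harris_inverse(Reg$dist)
rmse      <- sqrt(mean(resid_ED^2))
r_squared <- 1 - sum(resid_ED^2) / sum((Reg$ED - mean(Reg$ED))^2)
print(c(RMSE = rmse, R2 = r_squared))

# Data points with the fitted curve on top
plot(Reg$dist, Reg$ED, xlab = "dist", ylab = "ED")
curve(harris_inverse(x), from = 75, to = 600, add = TRUE, col = 2)
```

If you would rather estimate the parameters yourself, nls() with these values as starting points is one possible route, though convergence is not guaranteed with only eight observations.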


Upvotes: 1

AntoniosK

Reputation: 16121

Here's an option using the loess function to build your nonlinear model:

dt = read.table(text = "dist     ED
75  4.9
150 7.6
225 8.9
300 8.8
375 8.1
450 7.3
525 6.5
600 5.8", header = TRUE)

# build the model
m = loess(ED ~ dist, data = dt)

# see model summary
summary(m)

# Call:
#   loess(formula = ED ~ dist, data = dt)
# 
# Number of Observations: 8 
# Equivalent Number of Parameters: 4.41 
# Residual Standard Error: 0.06949 
# Trace of smoother matrix: 4.87  (exact)
# 
# Control settings:
# span     :  0.75 
# degree   :  2 
# family   :  gaussian
# surface  :  interpolate     cell = 0.2
# normalize:  TRUE
# parametric:  FALSE
# drop.square:  FALSE 

# plot points and model fit
plot(dt$dist, dt$ED)
lines(dt$dist, fitted(m), col = 2)


If you really want to use the lowess function for some reason, you can do the following:

plot(dt$dist, dt$ED)
lines(lowess(dt$dist, dt$ED), col = "blue")
lines(lowess(dt$dist, dt$ED, f = 0.5), col = "green")
lines(lowess(dt$dist, dt$ED, f = 0.3), col = "red")

This gives a similar curve, but only if you choose a small enough value for the smoothing parameter f.


The practical difference between the two functions here lies in their defaults: loess uses span = 0.75 with local quadratic fits (degree = 2, as in the summary above), which works well for this data, whereas the lowess default (f = 2/3, with local linear fits) smooths too much in your case.
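With only eight observations, joining the fitted values with straight segments gives a somewhat angular line. A minimal sketch of one way to draw a smoother curve is to predict from the loess model on a finer grid of dist values; the names grid and ED_hat and the choice of 200 grid points are arbitrary assumptions made for illustration.

```r
# Data from the question
dt <- data.frame(
  dist = c(75, 150, 225, 300, 375, 450, 525, 600),
  ED   = c(4.9, 7.6, 8.9, 8.8, 8.1, 7.3, 6.5, 5.8)
)

# Same model as above, with the default span and degree
m <- loess(ED ~ dist, data = dt)

# Fine grid of dist values inside the observed range
# (loess does not extrapolate by default)
grid <- data.frame(dist = seq(min(dt$dist), max(dt$dist), length.out = 200))
ED_hat <- predict(m, newdata = grid)

# Points plus the smooth fitted curve
plot(dt$dist, dt$ED, xlab = "dist", ylab = "ED")
lines(grid$dist, ED_hat, col = 2)
```

The grid stays within the range of the observed dist values because predictions outside that range would come back as NA under the default settings.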

Upvotes: 2

Rui Barradas

Reputation: 76402

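In the question, dist and ED are swapped in the plot() and lowess() calls: the model uses dist as the predictor, but the plot puts ED on the x-axis. The m2 and lines() calls also refer to columns (dec, PVFD_Mean) that are not in the data shown. With the axes kept consistent, the fitted curve appears: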

m1 <- lm(ED ~ poly(dist, 2, raw = TRUE), data = Reg)
summary(m1)

plot(Reg$dist, Reg$ED)
lines(lowess(Reg$dist, Reg$ED))
lines(Reg$dist, predict(m1), col = 2)
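For a version that runs on its own, the sketch below rebuilds Reg from the values in the question, draws the quadratic fit on a finer grid, and compares it with a cubic using AIC as one possible way of choosing between candidate models. The cubic model m3, the grid size, and the use of AIC are assumptions added for illustration; they are not part of the original answer.

```r
# Data from the question
Reg <- data.frame(
  dist = c(75, 150, 225, 300, 375, 450, 525, 600),
  ED   = c(4.9, 7.6, 8.9, 8.8, 8.1, 7.3, 6.5, 5.8)
)

# Quadratic model from the answer, plus a cubic for comparison (assumption)
m1 <- lm(ED ~ poly(dist, 2, raw = TRUE), data = Reg)
m3 <- lm(ED ~ poly(dist, 3, raw = TRUE), data = Reg)

# Predict on a fine grid so the curves look smooth
grid <- data.frame(dist = seq(min(Reg$dist), max(Reg$dist), length.out = 200))
pred1 <- predict(m1, newdata = grid)
pred3 <- predict(m3, newdata = grid)

plot(Reg$dist, Reg$ED, xlab = "dist", ylab = "ED")
lines(grid$dist, pred1, col = 2)  # quadratic
lines(grid$dist, pred3, col = 4)  # cubic

# Lower AIC suggests a better trade-off between fit and complexity
aic_table <- AIC(m1, m3)
print(aic_table)
```

With so few observations, any such comparison should be read cautiously, since higher-degree polynomials can fit the points closely without describing the underlying relationship.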


Upvotes: 0
