Reputation: 73
I have a dataset: Reg
dist ED
75 4.9
150 7.6
225 8.9
300 8.8
375 8.1
450 7.3
525 6.5
600 5.8
I want to find a well-fitting nonlinear regression model. I've tried:
plot(Reg$ED, Reg$dist)
lines(lowess(Reg$ED,Reg$dist))
m1 <- lm(ED ~poly(dist,2,raw=TRUE),data=Reg)
m2 <- lm(ED~dec+I(dist^2),data=Reg)
summary(m1)
summary(m2)
lines(Reg$PVFD_Mean, predict(m2), col=2)
But the regression lines don't show up in the plot, and I don't know why, so I can't work out which model fits my data best. I also tried fitModel, but that didn't work either.
Any help is much appreciated.
Thanks a lot
Upvotes: 2
Views: 1301
Reputation: 4647
My equation search shows a good fit to a three-parameter inverse Harris yield density equation, "y = x / (a + b * pow(x, c))", where x is dist and y is ED, with parameters a = 1.4956613575678071E+01, b = 7.8559465184281589E-05, and c = 2.1768293119284090E+00, giving RMSE = 0.1002 and R-squared = 0.9943.
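To check the numbers against the data from the question, here is a minimal sketch that plugs the reported parameters into the equation and recomputes the fit statistics. The helper name harris and the data frame name dt are my own choices for illustration, not part of any package.

```r
# Data from the question
dt <- data.frame(
  dist = c(75, 150, 225, 300, 375, 450, 525, 600),
  ED   = c(4.9, 7.6, 8.9, 8.8, 8.1, 7.3, 6.5, 5.8)
)

# Inverse Harris yield density form quoted above: y = x / (a + b * x^c)
# (harris is a hypothetical helper name)
harris <- function(x, a, b, c) x / (a + b * x^c)

# Parameters exactly as reported in this answer
p <- list(a = 1.4956613575678071e+01,
          b = 7.8559465184281589e-05,
          c = 2.1768293119284090e+00)

# Residuals of the curve at the observed dist values
fitted_ED <- harris(dt$dist, p$a, p$b, p$c)
res <- dt$ED - fitted_ED

# Recompute the fit statistics from the residuals
rmse <- sqrt(mean(res^2))
r_squared <- 1 - sum(res^2) / sum((dt$ED - mean(dt$ED))^2)
print(c(RMSE = rmse, R2 = r_squared))

# Overlay the curve on the data
plot(dt$dist, dt$ED)
curve(harris(x, p$a, p$b, p$c), from = 75, to = 600, add = TRUE, col = 2)
```

If you want R to estimate the parameters itself, these values would be reasonable starting values for nls(), though I have not checked how well it converges on only eight points.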
Upvotes: 1
Reputation: 16121
Here's an option that uses the loess function to build your non-linear model:
dt = read.table(text = "dist ED
75 4.9
150 7.6
225 8.9
300 8.8
375 8.1
450 7.3
525 6.5
600 5.8", header = TRUE)
# build the model
m = loess(ED ~ dist, data = dt)
# see model summary
summary(m)
# Call:
# loess(formula = ED ~ dist, data = dt)
#
# Number of Observations: 8
# Equivalent Number of Parameters: 4.41
# Residual Standard Error: 0.06949
# Trace of smoother matrix: 4.87 (exact)
#
# Control settings:
# span : 0.75
# degree : 2
# family : gaussian
# surface : interpolate cell = 0.2
# normalize: TRUE
# parametric: FALSE
# drop.square: FALSE
# plot points and model fit
plot(dt$dist, dt$ED)
lines(dt$dist, m$fitted, col=2)
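With only eight points, joining the fitted values gives a slightly angular line. A smoother curve can be drawn by predicting on a finer grid of dist values. This is only a sketch: the name grid and the choice of 200 points are arbitrary assumptions.

```r
# Data from the question
dt <- data.frame(
  dist = c(75, 150, 225, 300, 375, 450, 525, 600),
  ED   = c(4.9, 7.6, 8.9, 8.8, 8.1, 7.3, 6.5, 5.8)
)

# Same model as above, with default span and degree
m <- loess(ED ~ dist, data = dt)

# Fine grid covering the observed range only (loess does not
# extrapolate by default, so stay inside min/max of dist)
grid <- data.frame(dist = seq(min(dt$dist), max(dt$dist), length.out = 200))
grid$ED_hat <- predict(m, newdata = grid)

# Plot the points and the smooth fitted curve
plot(dt$dist, dt$ED)
lines(grid$dist, grid$ED_hat, col = 2)
```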
If you really want to use the lowess function for some reason, you can do the following:
plot(dt$dist, dt$ED)
lines(lowess(dt$dist, dt$ED), col = "blue")
lines(lowess(dt$dist, dt$ED, f = 0.5), col = "green")
lines(lowess(dt$dist, dt$ED, f = 0.3), col = "red")
This gives you much the same plot, but you have to choose a smaller value for the smoothing parameter f than the default.
The practical difference between the two methods here is in their defaults. loess uses span = 0.75 with local quadratic fits (degree = 2 in the summary above), which works well for this data. lowess uses f = 2/3 with local linear fits, which smooths too heavily in your case.
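To put rough numbers on the effect of f, the sketch below computes the in-sample RMSE of lowess for the three values used above. The names f_values and rmse_by_f are just illustrative. Keep in mind that a lower in-sample RMSE is not the same as a better model: with eight points, a very small f can end up simply following the data.

```r
# Data from the question (already sorted by dist, so the output of
# lowess, which is ordered by x, lines up with the rows of dt)
dt <- data.frame(
  dist = c(75, 150, 225, 300, 375, 450, 525, 600),
  ED   = c(4.9, 7.6, 8.9, 8.8, 8.1, 7.3, 6.5, 5.8)
)

# Default f (2/3) and the two smaller values tried above
f_values <- c(2/3, 0.5, 0.3)

# In-sample RMSE of the lowess fit for each f
rmse_by_f <- sapply(f_values, function(f) {
  fit <- lowess(dt$dist, dt$ED, f = f)
  sqrt(mean((dt$ED - fit$y)^2))
})
names(rmse_by_f) <- paste0("f=", round(f_values, 2))
print(rmse_by_f)
```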
Upvotes: 2
Reputation: 76402
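In the question the roles of dist and ED are sometimes swapped, and m2 and the last lines() call refer to columns (dec, PVFD_Mean) that are not in the dataset shown. With the axes used consistently, the fitted curve shows up: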
m1 <- lm(ED ~ poly(dist, 2, raw = TRUE), data = Reg)
summary(m1)
plot(Reg$dist, Reg$ED)
lines(lowess(Reg$dist, Reg$ED))
lines(Reg$dist, predict(m1), col = 2)
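As for choosing between candidate models, one possible approach (a sketch, not the only way to do it) is to compare polynomial fits of different degree with AIC. The cubic model m3 below is my own addition for illustration. It uses orthogonal polynomials, which give the same fitted values as raw ones but are numerically better behaved.

```r
# Data from the question
Reg <- data.frame(
  dist = c(75, 150, 225, 300, 375, 450, 525, 600),
  ED   = c(4.9, 7.6, 8.9, 8.8, 8.1, 7.3, 6.5, 5.8)
)

# Quadratic model as above, plus a hypothetical cubic alternative
m1 <- lm(ED ~ poly(dist, 2, raw = TRUE), data = Reg)
m3 <- lm(ED ~ poly(dist, 3), data = Reg)

# Lower AIC suggests a better trade-off between fit and complexity;
# with only 8 observations, treat the comparison as a rough guide
cmp <- AIC(m1, m3)
print(cmp)

# Draw both fitted curves over the data
plot(Reg$dist, Reg$ED)
lines(Reg$dist, predict(m1), col = 2)
lines(Reg$dist, predict(m3), col = 4)
```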
Upvotes: 0