DWFotos

Reputation: 33

5th order polynomial not fitting

I am trying to calculate a standard curve for concentration and MFI (median fluorescence intensity) values and apply it to determine the concentration for new MFI data. The data are supposed to be fit with a 5th-order polynomial function, but I get a weird fit and incorrect results with any order above 3. I can get a 5th-order polynomial to fit using Excel, and it predicts as it should. Any thoughts?

#Standard Values:
concentration = c(2500.0, 1250.0,  625.0,  312.5,  156.0,   80.0,   40.0,   20.0,    0.0)
MFI = c(8414, 3902, 1355,  928,  555,  324,  253,  187,  137)

# Code to fit: 
Standards = data_frame(MFI, concentration)
poly_fit = lm(concentration ~ poly(MFI, degree = 5), data = Standards)

EDIT: the Excel formula being used is =LINEST(Concentration Rows,MFI Rows^{1,2,3,4,5},TRUE,TRUE)

This is the "weird" fit I am getting in R:

[Image: plot of the fit obtained in R, not preserved]

#The MFI data to fit with the function:
samples = c(2951.0,  3197.0,  3141.0, 13166.0, 12646.0, 12869.0,  9395.5,  9681.0,  9785.0,  9513.0,  9133.0,  9430.0,  6798.0,  5935.0,  5749.0)

#The accompanying known concentration values for the samples:
sample_knowns = c(994.4858, 1076.0902, 1057.5298, 4299.3047, 4133.6606, 4204.7194, 3093.5100, 3185.2317, 3218.6245, 3131.2682, 3009.1081, 3104.5978, 2255.0522, 1974.6138, 1914.0260)

Upvotes: 1

Views: 255

Answers (2)

Ben Bolker

Reputation: 226097

I'm going to follow up on @user20650's comments and suggest that what you're seeing is a standard case of the instability of polynomial regression. As @Vincent suggests, the predictions are fine (the predicted curves go through all the points, extremely accurately for degree=4 and 5), but the curves are indeed silly. I would be very surprised if R were making a mistake here; can you include some more about the details of your Excel fitting procedure in your question? Is the polynomial regression fit stabilized somehow ... ??

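# Grid of MFI values at which to draw each fitted curve
# (uses MFI, concentration and Standards as defined in the question)
nS <- data.frame(MFI=seq(0,max(MFI), length.out=101))
par(las=1,bty="l")
# Observed standards, with MFI on a log scale
plot(concentration~MFI, data=Standards,pch=16,cex=2,log="x")
# Overlay the fitted curve for each polynomial degree from 1 to 5
for (d in 1:5) {
    poly_fit = lm(concentration ~ poly(MFI, degree = d), data = Standards)
    lines(nS$MFI, predict(poly_fit, newdata=nS), col=d+1, lwd=2)
}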

[Image: plot produced by the code above, not preserved]
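Since the instability of a polynomial fit tends to matter most outside the range of the data, it may also be worth checking where the new samples fall relative to the standards. The following is only a sketch using the values posted in the question; the names `outside`, `newdat`, `preds` and `comparison` are mine, not from the original code. It flags the samples whose MFI lies beyond the standards' range and tabulates the predicted concentration for each degree next to the known values.

```r
# Data copied from the question
concentration <- c(2500.0, 1250.0, 625.0, 312.5, 156.0, 80.0, 40.0, 20.0, 0.0)
MFI <- c(8414, 3902, 1355, 928, 555, 324, 253, 187, 137)
Standards <- data.frame(MFI, concentration)
samples <- c(2951.0, 3197.0, 3141.0, 13166.0, 12646.0, 12869.0, 9395.5, 9681.0,
             9785.0, 9513.0, 9133.0, 9430.0, 6798.0, 5935.0, 5749.0)
sample_knowns <- c(994.4858, 1076.0902, 1057.5298, 4299.3047, 4133.6606,
                   4204.7194, 3093.5100, 3185.2317, 3218.6245, 3131.2682,
                   3009.1081, 3104.5978, 2255.0522, 1974.6138, 1914.0260)

# Flag samples whose MFI lies outside the range covered by the standards;
# predictions for these are extrapolations
outside <- samples < min(MFI) | samples > max(MFI)
table(outside)

# Predicted concentration at each sample MFI for polynomial degrees 1 to 5
newdat <- data.frame(MFI = samples)
preds <- sapply(1:5, function(d) {
  fit <- lm(concentration ~ poly(MFI, degree = d), data = Standards)
  predict(fit, newdata = newdat)
})
colnames(preds) <- paste0("degree_", 1:5)

# Side-by-side view: known values next to the prediction from each degree
comparison <- data.frame(MFI = samples, known = sample_knowns,
                         outside, round(preds, 1))
comparison
```

The largest standard has an MFI of 8414, so any sample above that is predicted by extrapolation, which is where I would expect the different degrees to disagree most.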

Upvotes: 2

Vincent

Reputation: 17725

The only weird thing I see in your code is that you use data_frame with an underscore instead of data.frame with a dot. Otherwise, the example works fine. The in-sample predictions are very close to the observed values as shown in this plot:

concentration = c(2500.0, 1250.0, 625.0, 312.5, 156.0, 80.0, 40.0, 20.0, 0.0)
MFI = c(8414, 3902, 1355, 928, 555, 324, 253, 187, 137)
Standards = data.frame(concentration, MFI)

poly_fit = lm(concentration ~ poly(MFI, degree = 5), data = Standards)

summary(poly_fit)
#> 
#> Call:
#> lm(formula = concentration ~ poly(MFI, degree = 5), data = Standards)
#> 
#> Residuals:
#>          1          2          3          4          5          6          7 
#>  2.432e-05 -1.931e-03  3.455e-01 -1.407e+00  1.971e+00  4.675e+00 -8.415e+00 
#>          8          9 
#> -4.558e-01  3.288e+00 
#> 
#> Coefficients:
#>                        Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)             553.722      2.015 274.754 1.06e-07 ***
#> poly(MFI, degree = 5)1 2348.794      6.046 388.486 3.76e-08 ***
#> poly(MFI, degree = 5)2 -164.330      6.046 -27.180 0.000109 ***
#> poly(MFI, degree = 5)3  135.835      6.046  22.467 0.000193 ***
#> poly(MFI, degree = 5)4   99.164      6.046  16.402 0.000493 ***
#> poly(MFI, degree = 5)5   42.537      6.046   7.035 0.005900 ** 
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 6.046 on 3 degrees of freedom
#> Multiple R-squared:      1,  Adjusted R-squared:  0.9999 
#> F-statistic: 3.05e+04 on 5 and 3 DF,  p-value: 2.963e-07

plot(Standards$concentration,
     predict(poly_fit))
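For a numeric version of the same check, here is a small sketch that lists observed and fitted concentrations side by side and then predicts for new MFI values. The object names `check`, `new_MFI` and `new_pred` and the three new MFI values are hypothetical, chosen only for illustration (they lie inside the range of the standards). Note that the column in `newdata` has to be called `MFI` to match the formula.

```r
# Same data and model as above
concentration <- c(2500.0, 1250.0, 625.0, 312.5, 156.0, 80.0, 40.0, 20.0, 0.0)
MFI <- c(8414, 3902, 1355, 928, 555, 324, 253, 187, 137)
Standards <- data.frame(concentration, MFI)
poly_fit <- lm(concentration ~ poly(MFI, degree = 5), data = Standards)

# Observed vs fitted concentration for each standard
check <- data.frame(MFI = Standards$MFI,
                    observed = Standards$concentration,
                    fitted = round(fitted(poly_fit), 2))
check

# Largest absolute in-sample error
max(abs(residuals(poly_fit)))

# Predictions for new data: the column name must match the predictor (MFI)
# These three MFI values are hypothetical, for illustration only
new_MFI <- data.frame(MFI = c(200, 1000, 5000))
new_pred <- predict(poly_fit, newdata = new_MFI)
new_pred
```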

Upvotes: 2
