Reputation: 33
I am trying to calculate a standard curve from concentration and MFI (median fluorescence intensity) values and apply it to determine the concentration for new MFI data. The data are supposed to be fit with a 5th-order polynomial, but I get a weird fit and incorrect results with anything above 3rd order. I can get a 5th-order polynomial to fit in Excel, and it predicts as it should. Any thoughts?
#Standard Values:
concentration = c(2500.0, 1250.0, 625.0, 312.5, 156.0, 80.0, 40.0, 20.0, 0.0)
MFI = c(8414, 3902, 1355, 928, 555, 324, 253, 187, 137)
# Code to fit:
Standards = data_frame(MFI, concentration)
poly_fit = lm(concentration ~ poly(MFI, degree = 5), data = Standards)
EDIT: the Excel formula being used is =LINEST(Concentration Rows,MFI Rows^{1,2,3,4,5},TRUE,TRUE)
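For comparison, a rough R analogue of that LINEST call might look like the sketch below. It assumes the Excel formula regresses concentration on the raw powers MFI^1 to MFI^5 with an intercept, which is what `poly(..., raw = TRUE)` does, whereas the default `poly()` uses orthogonal polynomials. The names `raw_fit` and `orth_fit` are only illustrative, and `data.frame()` is used instead of `data_frame()` so that no extra package is needed.

```r
# Standards from the question
concentration <- c(2500.0, 1250.0, 625.0, 312.5, 156.0, 80.0, 40.0, 20.0, 0.0)
MFI <- c(8414, 3902, 1355, 928, 555, 324, 253, 187, 137)
Standards <- data.frame(MFI, concentration)

# Raw powers MFI, MFI^2, ..., MFI^5 (assumed to mirror the LINEST setup)
raw_fit <- lm(concentration ~ poly(MFI, degree = 5, raw = TRUE), data = Standards)

# Orthogonal polynomials, as in the original R code
orth_fit <- lm(concentration ~ poly(MFI, degree = 5), data = Standards)

# Coefficients on the raw scale are the ones comparable to Excel's output.
# The raw columns differ by many orders of magnitude, so lm() may treat one
# as collinear and report NA for it.
print(coef(raw_fit))

# In exact arithmetic both parameterizations give the same fitted values;
# a large difference here would point to numerical trouble in the raw fit.
print(max(abs(fitted(raw_fit) - fitted(orth_fit))))
```

If the raw coefficients differ from Excel's, that would suggest the two programs handle the ill-conditioned raw powers differently rather than that either one fits a different model.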
This is the "weird" fit I am getting in R:
#The MFI data to fit with the function:
samples = c(2951.0, 3197.0, 3141.0, 13166.0, 12646.0, 12869.0, 9395.5, 9681.0, 9785.0, 9513.0, 9133.0, 9430.0, 6798.0, 5935.0, 5749.0)
#The accompanying known concentration values for the samples:
sample_knowns = c(994.4858, 1076.0902, 1057.5298, 4299.3047, 4133.6606, 4204.7194, 3093.5100, 3185.2317, 3218.6245, 3131.2682, 3009.1081, 3104.5978, 2255.0522, 1974.6138, 1914.0260)
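A minimal sketch of how the fitted curve would then be applied to the samples and compared with the known values. It assumes base R's `data.frame()`; the names `predicted`, `extrapolated`, and `comparison` are made up for illustration. It also flags samples whose MFI lies outside the range covered by the standards, since those predictions are extrapolations.

```r
# Standards and 5th-order fit, as in the question
concentration <- c(2500.0, 1250.0, 625.0, 312.5, 156.0, 80.0, 40.0, 20.0, 0.0)
MFI <- c(8414, 3902, 1355, 928, 555, 324, 253, 187, 137)
Standards <- data.frame(MFI, concentration)
poly_fit <- lm(concentration ~ poly(MFI, degree = 5), data = Standards)

samples <- c(2951.0, 3197.0, 3141.0, 13166.0, 12646.0, 12869.0, 9395.5, 9681.0,
             9785.0, 9513.0, 9133.0, 9430.0, 6798.0, 5935.0, 5749.0)
sample_knowns <- c(994.4858, 1076.0902, 1057.5298, 4299.3047, 4133.6606,
                   4204.7194, 3093.5100, 3185.2317, 3218.6245, 3131.2682,
                   3009.1081, 3104.5978, 2255.0522, 1974.6138, 1914.0260)

# predict() needs a data frame whose column name matches the predictor (MFI)
predicted <- predict(poly_fit, newdata = data.frame(MFI = samples))

# TRUE where the sample MFI is outside the range spanned by the standards
extrapolated <- samples < min(MFI) | samples > max(MFI)

comparison <- data.frame(MFI = samples,
                         known = sample_knowns,
                         predicted = predicted,
                         extrapolated = extrapolated)
print(comparison)
```

Nine of the fifteen samples have an MFI above the highest standard (8414), so for those the polynomial is being used well outside the region where it was fitted.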
Upvotes: 1
Views: 255
Reputation: 226097
I'm going to follow up on @user20650's comments and suggest that what you're seeing is a standard case of the instability of polynomial regression. As @Vincent suggests, the predictions at the standards are fine (the fitted curves go through all the points, extremely accurately for degree 4 and 5), but between the points the curves are indeed silly. I would be very surprised if R were making a mistake here; can you add more details about your Excel fitting procedure to your question? Is the polynomial regression fit stabilized somehow there?
nS <- data.frame(MFI=seq(0,max(MFI), length.out=101))
par(las=1,bty="l")
plot(concentration~MFI, data=Standards,pch=16,cex=2,log="x")
for (d in 1:5) {
  poly_fit = lm(concentration ~ poly(MFI, degree = d), data = Standards)
  lines(nS$MFI, predict(poly_fit, newdata=nS), col=d+1, lwd=2)
}
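The same point can be checked numerically rather than by eye. The sketch below is only a diagnostic, not a fix: it compares how well the degree-5 fit reproduces the standards with how the curve behaves on a dense grid between them. It assumes that a sensible calibration curve should increase monotonically with MFI, as the standards themselves do; the names `fit5`, `grid`, `max_resid`, and `n_decreasing` are illustrative.

```r
concentration <- c(2500.0, 1250.0, 625.0, 312.5, 156.0, 80.0, 40.0, 20.0, 0.0)
MFI <- c(8414, 3902, 1355, 928, 555, 324, 253, 187, 137)
Standards <- data.frame(MFI, concentration)

fit5 <- lm(concentration ~ poly(MFI, degree = 5), data = Standards)

# Largest absolute residual at the standards (in concentration units)
max_resid <- max(abs(residuals(fit5)))

# Dense grid restricted to the calibrated MFI range (no extrapolation)
grid <- data.frame(MFI = seq(min(MFI), max(MFI), length.out = 501))
grid$pred <- predict(fit5, newdata = grid)

# Number of grid steps where the predicted concentration goes DOWN as MFI
# goes up; anything above zero means the curve is not monotone
n_decreasing <- sum(diff(grid$pred) < 0)

print(max_resid)
print(n_decreasing)
print(range(grid$pred))  # compare with range(concentration), i.e. 0 to 2500
```

A small `max_resid` together with a non-zero `n_decreasing`, or a prediction range well outside 0 to 2500, would be the numerical signature of a curve that hits the points but wiggles between them.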
Upvotes: 2
Reputation: 17725
The only weird thing I see in your code is that you use data_frame with an underscore instead of data.frame with a dot. Otherwise, the example works fine. The in-sample predictions are very close to the observed values as shown in this plot:
concentration = c(2500.0, 1250.0, 625.0, 312.5, 156.0, 80.0, 40.0, 20.0, 0.0)
MFI = c(8414, 3902, 1355, 928, 555, 324, 253, 187, 137)
Standards = data.frame(concentration, MFI)
poly_fit = lm(concentration ~ poly(MFI, degree = 5), data = Standards)
summary(poly_fit)
#>
#> Call:
#> lm(formula = concentration ~ poly(MFI, degree = 5), data = Standards)
#>
#> Residuals:
#>          1          2          3          4          5          6          7
#>  2.432e-05 -1.931e-03  3.455e-01 -1.407e+00  1.971e+00  4.675e+00 -8.415e+00
#>          8          9
#> -4.558e-01  3.288e+00
#>
#> Coefficients:
#>                        Estimate Std. Error t value Pr(>|t|)
#> (Intercept)             553.722      2.015 274.754 1.06e-07 ***
#> poly(MFI, degree = 5)1 2348.794      6.046 388.486 3.76e-08 ***
#> poly(MFI, degree = 5)2 -164.330      6.046 -27.180 0.000109 ***
#> poly(MFI, degree = 5)3  135.835      6.046  22.467 0.000193 ***
#> poly(MFI, degree = 5)4   99.164      6.046  16.402 0.000493 ***
#> poly(MFI, degree = 5)5   42.537      6.046   7.035 0.005900 **
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 6.046 on 3 degrees of freedom
#> Multiple R-squared: 1, Adjusted R-squared: 0.9999
#> F-statistic: 3.05e+04 on 5 and 3 DF, p-value: 2.963e-07
plot(Standards$concentration,
predict(poly_fit))
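The same in-sample check can be done as a table instead of a plot. This is a minimal sketch; the name `check` is just illustrative.

```r
concentration <- c(2500.0, 1250.0, 625.0, 312.5, 156.0, 80.0, 40.0, 20.0, 0.0)
MFI <- c(8414, 3902, 1355, 928, 555, 324, 253, 187, 137)
Standards <- data.frame(concentration, MFI)
poly_fit <- lm(concentration ~ poly(MFI, degree = 5), data = Standards)

# Observed vs. fitted concentration at each standard
check <- data.frame(MFI = Standards$MFI,
                    observed = Standards$concentration,
                    fitted = fitted(poly_fit),
                    residual = residuals(poly_fit))
print(round(check, 3))

# Largest absolute in-sample error, in concentration units
print(max(abs(check$residual)))
```

Going by the residuals in the summary output, the largest in-sample error is about 8.4 concentration units (standard 7), so the fit reproduces the standards closely. This says nothing about how the curve behaves between or beyond them.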
Upvotes: 2