CodeGuy

Reputation: 28905

Output from scatter3d R script - how to read the equation

I am using scatter3d to find a fit in my R script. I did so, and here is the output:

Call:
lm(formula = y ~ (x + z)^2 + I(x^2) + I(z^2))

Residuals:
     Min       1Q   Median       3Q      Max 
-0.78454 -0.02302 -0.00563  0.01398  0.47846 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) -0.051975   0.003945 -13.173  < 2e-16 ***
x            0.224564   0.023059   9.739  < 2e-16 ***
z            0.356314   0.021782  16.358  < 2e-16 ***
I(x^2)      -0.340781   0.044835  -7.601 3.46e-14 ***
I(z^2)       0.610344   0.028421  21.475  < 2e-16 ***
x:z         -0.454826   0.065632  -6.930 4.71e-12 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Residual standard error: 0.05468 on 5293 degrees of freedom
Multiple R-squared: 0.6129, Adjusted R-squared: 0.6125 
F-statistic:  1676 on 5 and 5293 DF,  p-value: < 2.2e-16

Based on this, what is the equation of the best fit line? I'm not really sure how to read this output. Can someone explain? Thanks!

Upvotes: 0

Views: 575

Answers (2)

IRTFM

Reputation: 263362

It's not a plane but rather a paraboloid surface (with 'y' as the response, since 'x' and 'z' are your two predictors). Each value in the Estimate column multiplies its term:

y = -0.051975 + 0.224564 * x + 0.356314 * z
      - 0.340781 * x^2 + 0.610344 * z^2 - 0.454826 * x * z
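
If you want to evaluate that surface at particular points, here is a minimal sketch with the coefficients copied from the Estimate column in the question. The names `b` and `predict_y` are made up for illustration; they are not part of scatter3d or lm.

```r
# Coefficients copied from the Estimate column of the question's output
b <- c(intercept = -0.051975, x = 0.224564, z = 0.356314,
       x2 = -0.340781, z2 = 0.610344, xz = -0.454826)

# Hypothetical helper: evaluate the fitted quadratic surface at (x, z)
predict_y <- function(x, z) {
  b[["intercept"]] + b[["x"]] * x + b[["z"]] * z +
    b[["x2"]] * x^2 + b[["z2"]] * z^2 + b[["xz"]] * x * z
}

predict_y(0, 0)  # at the origin only the intercept remains
predict_y(1, 1)  # sum of all six coefficients
```

If you still have the fitted lm object, calling predict() on it with a newdata data frame containing x and z should give the same values without retyping the coefficients.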

Upvotes: 1

John Colby

Reputation: 22588

This is a basic regression output table. The parameter estimates ("Estimate" column) are the best-fit coefficients for the corresponding terms in your model. If you aren't familiar with this terminology, I would suggest reading a tutorial on linear models and regression; there are thousands around the web. I would also encourage you to play with some simpler 2D simulations.

For example, let's make some data with an intercept of 2 and a slope of 0.5:

# Simulate data
set.seed(12345)
x = seq(0, 10, length.out=50)
y = 2 + 0.5 * x + rnorm(length(x), 0, 0.1)
data = data.frame(x, y)

Now when we look at the fit, you'll see that the Estimate column recovers approximately these same values (about 2.01 for the intercept and 0.50 for the slope):

# Fit model
fit = lm(y ~ x, data=data)
summary(fit)

Call:
lm(formula = y ~ x, data = data)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.26017 -0.06434  0.02539  0.06238  0.20008 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 2.011759   0.030856   65.20   <2e-16 ***
x           0.501240   0.005317   94.27   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 0.1107 on 48 degrees of freedom
Multiple R-squared: 0.9946, Adjusted R-squared: 0.9945 
F-statistic:  8886 on 1 and 48 DF,  p-value: < 2.2e-16 

Pulling these out, we can then plot the best-fit line:

# Make plot
dev.new(width=4, height=4)
plot(x, y, ylim=c(0,10))
abline(coef(fit)[1], coef(fit)[2])  # intercept first, then slope
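
As a sanity check on reading the table, here is a small sketch that rebuilds the fitted values by hand from the extracted coefficients and compares them with what predict() returns. It re-runs the simulation above so that it is self-contained; `manual` is just an illustrative name.

```r
# Re-create the simulated data and the fit from the example above
set.seed(12345)
x <- seq(0, 10, length.out = 50)
y <- 2 + 0.5 * x + rnorm(length(x), 0, 0.1)
fit <- lm(y ~ x, data = data.frame(x, y))

# coef() returns a named vector whose names match the rows of the table
b <- coef(fit)
names(b)

# Equation read off the table: y = intercept + slope * x
manual <- b[["(Intercept)"]] + b[["x"]] * x

# Should agree with the model's own fitted values
all.equal(manual, unname(predict(fit)))
```

The same reading applies to the larger model in the question: each row name in the table is a term, and its Estimate is the number that multiplies that term.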


Upvotes: 2
