Reputation: 31
So I have a dataset with one variable, "Date", which goes "1, 2, 3, ..., 182". Then I have another variable which increases by either 0 or 1 at each step, so it goes like "0, 1, 2, 2, 2, 3, 4, ..." up to 95.
I have done a regression and everything works fine. But I can't seem to get the line function or the R-squared value. Usually you will use something like
lm_eq <- lm(var1 ~ var2 + var3)
summary(lm_eq)
and you will get all the data. But I have used ggplot for my regression as such:
t1 = ggplot() +
geom_line(data = odds, aes(x = Date, y = RorderOT1), colour="red") +
geom_smooth(data = odds, aes(x = Date, y = RorderOT1), colour="red") +
xlab('Match points') +
ylab('Number of outcomes')
print(t1)
summary(t1)
But here the summary function does not give me the line function, R-squared value or any other results at all.
I have tried looking around, but all the answers are for how to get the results into the graph. I don't want that, I only want the results as you usually get when you do a normal regression in R. I have also tried using the usual coding but that regression does not seem to match with the one ggplot does for me.
So is there an easy way to just get the results as usual, or do I need to specify something in the code?
Upvotes: 0
Views: 135
Reputation: 173793
You say that you've used ggplot for your regression, but you haven't. You have used ggplot to plot a regression line over your data.
Yes, internally ggplot will have to perform a regression to generate the line, but its job is not to return a model object that you can use to describe your results mathematically. Its job is to draw a line in the right place. The things you see when you do a summary of a ggplot are a summary of all the many, many elements that go into making a fully customizable plot. The regression line is only one small part of that, and a ggplot object should not be expected to give you all the functionality of a proper linear regression model.
You have already given the answer yourself in the question. You need to do a separate regression.
First I'll load ggplot and create some reproducible fake data that should closely match yours:
library(ggplot2)
set.seed(69)
odds <- data.frame(Date = 1:182, RorderOT1 = cumsum(rbinom(182, 1, 0.5)))
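Then we create the plot. Note that if you want a straight line for your regression you need to specify method = "lm" in geom_smooth. Without it, geom_smooth defaults to a loess smoother for a data set of this size, which is probably why your lm() results did not match the curve on your plot: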
ggplot(data = odds) +
geom_line(aes(x = Date, y = RorderOT1), colour="red") +
geom_smooth(method = "lm", aes(x = Date, y = RorderOT1), colour="red") +
xlab('Match points') +
ylab('Number of outcomes')
Now you can do your regression separately:
my_model <- lm(RorderOT1 ~ Date, data = odds)
summary(my_model)
#>
#> Call:
#> lm(formula = RorderOT1 ~ Date, data = odds)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -4.3970 -1.9570 0.1362 1.9126 3.6045
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) -1.610528 0.321253 -5.013 1.27e-06 ***
#> Date 0.514329 0.003045 168.924 < 2e-16 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 2.158 on 180 degrees of freedom
#> Multiple R-squared: 0.9937, Adjusted R-squared: 0.9937
#> F-statistic: 2.854e+04 on 1 and 180 DF, p-value: < 2.2e-16
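As a quick check that this model is the same fit that geom_smooth drew, you can pull the computed smooth out of the plot with layer_data() and compare it to predictions from lm. This is only a sketch: the names p, smooth_df, fit and direct are made up for illustration, and it assumes geom_smooth is the second layer of the plot, as it is in the code above.

```r
library(ggplot2)

# Same fake data as above
set.seed(69)
odds <- data.frame(Date = 1:182, RorderOT1 = cumsum(rbinom(182, 1, 0.5)))

# Same plot as above, stored in an object (p is an illustrative name)
p <- ggplot(odds, aes(x = Date, y = RorderOT1)) +
  geom_line(colour = "red") +
  geom_smooth(method = "lm", formula = y ~ x, colour = "red")

# layer_data() returns the data ggplot computed for a layer;
# layer 2 is geom_smooth here, so x and y trace the drawn line
smooth_df <- layer_data(p, 2)

# Fit the model directly and predict at the x positions ggplot used
fit <- lm(RorderOT1 ~ Date, data = odds)
direct <- unname(predict(fit, newdata = data.frame(Date = smooth_df$x)))

# Should be TRUE if both are the same least-squares line
all.equal(smooth_df$y, direct)
```

If you drop method = "lm" from geom_smooth, the same comparison would be expected to fail, because the plotted curve is then no longer a straight-line fit.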
Now you can explore all the information you want about the regression, plug it into the predict function, compare the fit to other models, see its covariance matrix, etc. because it is a model object.
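For example, here is a rough sketch of pulling out the individual pieces you asked about. The second model, quad_model, and the new Date values are hypothetical and only there to show a comparison and a prediction; they are not part of your analysis.

```r
# Same fake data and model as above
set.seed(69)
odds <- data.frame(Date = 1:182, RorderOT1 = cumsum(rbinom(182, 1, 0.5)))
my_model <- lm(RorderOT1 ~ Date, data = odds)

# Intercept and slope, i.e. the equation of the fitted line
line_coefs <- coef(my_model)
line_coefs

# R-squared on its own
r2 <- summary(my_model)$r.squared
r2

# Covariance matrix of the coefficient estimates
vcov(my_model)

# Predictions for new values of Date (hypothetical values, beyond the data)
predict(my_model, newdata = data.frame(Date = c(190, 200)))

# Compare against another model (quad_model is a made-up alternative)
quad_model <- lm(RorderOT1 ~ poly(Date, 2), data = odds)
anova(my_model, quad_model)
```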
Created on 2020-05-06 by the reprex package (v0.3.0)
Upvotes: 1