Mike.Gahan

Reputation: 4615

Trouble Extracting Model Summary Statistics

I am having some trouble extracting some model summary statistics from R.

For example's sake, let us use the iris dataset.

data(iris)
model1 <- summary(lm(Sepal.Length~Sepal.Width,data=iris))

I would like to extract r.squared and adj.r.squared from the summary statistics.

lapply(model1, "[", c("r.squared", "adj.r.squared"))
Error in terms.formula(newformula, specials = names(attr(termobj, "specials"))) : 
  invalid model formula in ExtractVars

I am confused, because the following seems to work fine:

model1[c('r.squared', 'adj.r.squared')]
# $r.squared
# [1] 0.01382265
# 
# $adj.r.squared
# [1] 0.007159294

Does someone understand this error? Thanks so much for any help you can provide.

Upvotes: 0

Views: 263

Answers (2)

r2evans

Reputation: 160417

As an alternative to @RichardScriven's answer, if you really want to use lapply (perhaps because this is a generalization of similar tests): you've swapped the arguments.

You are trying to iterate over a list of model statistics from one run and grab two elements. For example, if:

a <- list(x=1, y=2, z=3)

then lapply(a, `[`, c('x', 'y')) will unroll first to a[1][ c('x', 'y') ], then a[2][ c('x', 'y') ], and finally a[3][ c('x', 'y') ], which is not (I think) what you want. [1]
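A small sketch of the unrolling described above, using the same toy list a. The names per_element and direct are made up for illustration. It only shows that lapply subsets each element in turn rather than the list as a whole:

```r
a <- list(x = 1, y = 2, z = 3)

# lapply hands each element of `a` to `[` together with c("x", "y"),
# so the subsetting happens inside each element, not on the list itself
per_element <- lapply(a, `[`, c("x", "y"))
str(per_element)

# Subsetting the list itself is what actually picks out x and y
direct <- a[c("x", "y")]
str(direct)
```

The result of the lapply call still has one entry per element of a (x, y and z), and none of those entries contains the values being looked for.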

Two possible ways for you to go.

  1. Try lapply( c('x', 'y'), function(i) a[[i]]) (or sapply, for that matter). This is perhaps a bit awkward, though, since it really has no advantage over a[ c('x', 'y') ] in the first place.

  2. If you intend to grab the model statistics from multiple runs and are simply testing with a single set of statistics, try:

    lapply( list(a), `[`, c('x', 'y') )

    Replace list(a) with modellist, which could have been formed with modellist <- list(model1, model2, model3) (or, more appropriately, the return value of a different *apply function).
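A minimal sketch of the second option with real model summaries. The formulas list and the names width, petal and r2_table are assumptions chosen for illustration; they are not from the question:

```r
data(iris)

# Hypothetical set of models to compare; any formulas would do
formulas <- list(
  width = Sepal.Length ~ Sepal.Width,
  petal = Sepal.Length ~ Petal.Length
)

# One summary.lm object per formula
modellist <- lapply(formulas, function(f) summary(lm(f, data = iris)))

# Now each element IS a model summary, so `[` with names works as intended
stats <- lapply(modellist, `[`, c("r.squared", "adj.r.squared"))

# Optionally collapse to a matrix: one column per model
r2_table <- sapply(stats, unlist)
r2_table
```

Here lapply iterates over the models, and the names are passed as the extra argument to `[`, which is the arrangement the single-model attempt in the question was missing.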

PS:
[1] If you try it, the output doesn't match exactly, because lapply passes each element itself (a[[1]], not a[1]) to `[`, but it is conceptually how I envision what is going on.
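To see which element of model1 triggers the error in the question, one option is to apply `[` to each element separately and record the failures instead of stopping at the first one. This is only a diagnostic sketch; keys and failed are illustrative names:

```r
data(iris)
model1 <- summary(lm(Sepal.Length ~ Sepal.Width, data = iris))
keys <- c("r.squared", "adj.r.squared")

# Try el[keys] on every element of the summary, catching errors
failed <- vapply(model1, function(el) {
  inherits(tryCatch(el[keys], error = function(e) e), "error")
}, logical(1))

# Names of the elements for which `[` raised an error
names(failed)[failed]
```

The call to terms.formula in the question's error message suggests that the terms element is the culprit: subsetting a terms object goes through its own `[` method, which tries to build a new formula from the requested names.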

Upvotes: 0

Rich Scriven

Reputation: 99331

There isn't a need for lapply here. str(model1) tells us that model1 is a list of 11 elements with

> names(model1)
#  [1] "call"          "terms"         "residuals"     "coefficients"  
#  [5] "aliased"       "sigma"         "df"            "r.squared" 
#  [9] "adj.r.squared" "fstatistic"    "cov.unscaled" 

The entire list can be viewed with c(model1). The r-squared values can be accessed directly with

> model1[c('r.squared', 'adj.r.squared')]
# $r.squared
# [1] 0.01382265

# $adj.r.squared
# [1] 0.007159294

or with a regular expression to capture both r-squared values

> model1[grepl('squared', names(model1))]
# $r.squared
# [1] 0.01382265

# $adj.r.squared
# [1] 0.007159294
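If plain numbers are more convenient than a list, the same subset can be flattened. A short sketch, where r2_values is an illustrative name:

```r
data(iris)
model1 <- summary(lm(Sepal.Length ~ Sepal.Width, data = iris))

# A single statistic as a bare number
model1$r.squared

# Both statistics as a named numeric vector instead of a list
r2_values <- unlist(model1[c("r.squared", "adj.r.squared")])
r2_values
```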

Upvotes: 3
