fabha

Reputation: 111

R: get the best N values of all list subsets

I have the summaries of many linear models in a list called "listlmsummary".

listlmsummary <- lapply(listlm, summary)
listlmsummary

The output of listlmsummary looks like this (quite shortened):

$a
Residual standard error: 3835 on 1921 degrees of freedom
  (50 observations deleted due to missingness)
Multiple R-squared:   0.11, Adjusted R-squared:  0.1063 
F-statistic: 29.68 on 8 and 1921 DF,  p-value: < 2.2e-16

$b
Residual standard error: 3843 on 1898 degrees of freedom
  (68 observations deleted due to missingness)
Multiple R-squared:  0.1125,    Adjusted R-squared:  0.1065 
F-statistic: 18.51 on 13 and 1898 DF,  p-value: < 2.2e-16

$c
Residual standard error: 3760 on 1881 degrees of freedom
  (87 observations deleted due to missingness)
Multiple R-squared:  0.1221,    Adjusted R-squared:  0.117 
F-statistic: 23.79 on 11 and 1881 DF,  p-value: < 2.2e-16

$d
Residual standard error: 3826 on 1907 degrees of freedom
  (60 observations deleted due to missingness)
Multiple R-squared:  0.115, Adjusted R-squared:  0.1094 
F-statistic: 20.64 on 12 and 1907 DF,  p-value: < 2.2e-16

I want to extract the N highest (e.g. 2) adjusted R-squared values to find the best models, together with the name of the list element each value comes from. Does anyone have an idea how to do this?

I know that I can get a single R-squared value with this call:

listlmsummary[["a"]]$adj.r.squared

But extracting all R-squared values with something like listlmsummary[[]]$adj.r.squared or listlmsummary[[c("a", "b", "c", "d")]]$adj.r.squared and then ordering the output does not work.

Thank you for any help! :)

Upvotes: 0

Views: 80

Answers (3)

Nicolás Velasquez

Reputation: 5898

A quick and dirty way to do it might be:

# adj.r.squared lives in the summary objects, not in the lm fits themselves
adj_r2 <- sapply(listlmsummary, `[[`, "adj.r.squared")
Maxr2sq <- max(adj_r2)
Position <- which(adj_r2 == Maxr2sq)
Maxr2sq
Position

However, you might benefit from storing all the results in a data.frame for future reference. For instance, it is theoretically possible for more than one model to have the same Adj.R2 value. It is also convenient to store each regression's call (i.e. its formula).

In that case, you could run:

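library(tidyverse)

# Pull the values from the summaries; stack() turns the named vector into a data frame
AR2 <- sapply(listlmsummary, `[[`, "adj.r.squared") %>%
       stack() %>%
       select(values) %>%
       rename(Adj.R.sqr = values)
# One string per model call
Call <- sapply(listlmsummary, function(s) paste(deparse(s$call), collapse = " "))
Position <- data.frame(Position = seq_along(listlmsummary))
# as_tibble() replaces the deprecated as_data_frame()
DF <- as_tibble(cbind(AR2, Call, Position))
DF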
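
For readers who prefer to avoid the tidyverse dependency, the same idea can be sketched in base R. The models below are hypothetical stand-ins fitted on the built-in mtcars data, and the names results, model, adj_r_sqr and call are illustrative, not part of the original answer.

```r
# Hypothetical example models; replace with your own listlm
listlm <- list(
  a = lm(mpg ~ wt, data = mtcars),
  b = lm(mpg ~ wt + hp, data = mtcars),
  c = lm(mpg ~ cyl, data = mtcars)
)
listlmsummary <- lapply(listlm, summary)

# One adjusted R-squared and one call string per model
adj_r2 <- vapply(listlmsummary, function(s) s$adj.r.squared, numeric(1))
calls  <- vapply(listlmsummary,
                 function(s) paste(deparse(s$call), collapse = " "),
                 character(1))

results <- data.frame(
  model     = names(adj_r2),
  adj_r_sqr = unname(adj_r2),
  call      = unname(calls),
  stringsAsFactors = FALSE
)

# Best models first; head(results, 2) then gives the top two
results <- results[order(results$adj_r_sqr, decreasing = TRUE), ]
results
```

Keeping the element name in its own column means ties in Adj.R2 stay visible as separate rows.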

Upvotes: 1

akrun

Reputation: 887241

We can use sapply to extract the adj.r.squared values into a vector and order them in decreasing order. Then take the head of 'n' elements from the ordered 'listlmsummary'.

i1 <- order(-sapply(listlmsummary, `[[`, "adj.r.squared"))
head(listlmsummary[i1], n)

NOTE: This was answered with the logic and the complete solution requested by the user
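
A minimal self-contained sketch of this approach, for anyone who wants to try it without the original data. The four models are hypothetical stand-ins fitted on the built-in mtcars data, and adj_r2, top_n and best_summaries are illustrative names. It uses sort() on the named vector so that the values and the element names come back together.

```r
# Hypothetical stand-ins for the asker's models
listlm <- list(
  a = lm(mpg ~ wt, data = mtcars),
  b = lm(mpg ~ wt + hp, data = mtcars),
  c = lm(mpg ~ wt + hp + qsec, data = mtcars),
  d = lm(mpg ~ cyl, data = mtcars)
)
listlmsummary <- lapply(listlm, summary)

n <- 2  # how many of the best models to keep

# Named numeric vector: one adjusted R-squared per list element
adj_r2 <- sapply(listlmsummary, function(s) s$adj.r.squared)

# Sort from best to worst and keep the first n; the names identify the models
top_n <- head(sort(adj_r2, decreasing = TRUE), n)
top_n

# The matching summary objects, in the same order
best_summaries <- listlmsummary[names(top_n)]
```

If fewer than n models exist, head() simply returns all of them, so no extra length check is needed.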

Upvotes: 4

Aaron - mostly inactive

Reputation: 37764

sapply(listlmsummary, function(x) x$adj.r.squared)

Also see the new broom package.

Upvotes: 3
