John

Reputation: 476

What's a good way to store multiple models in an R data structure?

The R programming language lets you define linear models (e.g. with lm) and assign them to a variable.

f1 <-  lm(sonarData$V61~., sonarData )

Here is the object.

> f1 

Call:
lm(formula = sonarData$V61 ~ ., data = sonarData)

Coefficients:
(Intercept)           V1           V2           V3           V4           V5  
    1.29803     -4.20006     -4.34196     12.28351     -6.90441      0.02196  
         V6           V7           V8           V9          V10          V11  
   -1.56658      2.55997      1.43647     -1.61434      0.72176     -0.56495  
        V12          V13          V14          V15          V16          V17  
   -1.84479     -1.18588      0.81750     -0.71591      0.70529      1.30928  
        V18          V19          V20          V21          V22          V23  
   -1.83044      1.20194     -1.03330      1.07206     -1.01695      0.90517  
        V24          V25          V26          V27          V28          V29  
   -2.38743      1.56199     -0.07457     -0.82562      0.34103      1.24376  
        V30          V31          V32          V33          V34          V35  
   -3.34592      3.91707     -1.69221     -0.01213      1.42545     -2.33689  
        V36          V37          V38          V39          V40          V41  
    2.15805      0.23958     -0.19744     -1.29952      1.47017     -0.71109  
        V42          V43          V44          V45          V46          V47  
    0.64684     -0.17931     -0.41240     -0.41779      0.34729     -2.54380  
        V48          V49          V50          V51          V52          V53  
   -0.77669     -9.83263     20.01045      4.64701     -7.00754    -10.62042  
        V54          V55          V56          V57          V58          V59  
  -12.95094     23.36374      8.29332      3.12945    -16.89160    -13.58556  
        V60  
    6.55849  

A natural extension is to store multiple models. The R manual says, "Lists have elements, each of which can contain any type of R object". Unfortunately, I get a warning when I try to store a model in a list.

> aa <- list(type=any)
> aa[1] <- f1
Warning message:
In aa[1] <- f1 :
  number of items to replace is not a multiple of replacement length

Looks like only the coefficients have been stored.

> aa[1]
$type
 (Intercept)           V1           V2           V3           V4           V5 
  1.29802569  -4.20006167  -4.34195592  12.28351356  -6.90440693   0.02196316 
          V6           V7           V8           V9          V10          V11 
 -1.56658231   2.55997324   1.43647370  -1.61433971   0.72175967  -0.56495060 
         V12          V13          V14          V15          V16          V17 
 -1.84479102  -1.18587951   0.81750259  -0.71591460   0.70529370   1.30927725 
         V18          V19          V20          V21          V22          V23 
 -1.83043902   1.20193627  -1.03330328   1.07205668  -1.01695304   0.90516589 
         V24          V25          V26          V27          V28          V29 
 -2.38742679   1.56198758  -0.07456730  -0.82562068   0.34102565   1.24376201 
         V30          V31          V32          V33          V34          V35 
 -3.34592095   3.91707289  -1.69221059  -0.01213418   1.42545025  -2.33689151 
         V36          V37          V38          V39          V40          V41 
  2.15804936   0.23957839  -0.19743977  -1.29952333   1.47016998  -0.71108772 
         V42          V43          V44          V45          V46          V47 
  0.64683517  -0.17930867  -0.41239596  -0.41779109   0.34728768  -2.54380227 
         V48          V49          V50          V51          V52          V53 
 -0.77669122  -9.83262844  20.01045111   4.64701358  -7.00754193 -10.62042062 
         V54          V55          V56          V57          V58          V59 
-12.95093654  23.36374272   8.29331935   3.12944794 -16.89160356 -13.58556456 
         V60 
  6.55848543 

I wasn't expecting that. Can you help me understand why? What's the right way to use a data structure to store multiple models?

Thanks for your time and consideration.

Sincerely, John

Upvotes: 0

Views: 7414

Answers (2)

A better solution is to save the full model to disk. This is a good option for models that take a long time to fit, or if you want to retrieve them in a later session.

# To save it (written to the current working directory)
saveRDS(f1, "f1.rds")

# To retrieve it (use the same path that was passed to saveRDS)
f1 <- readRDS("f1.rds")
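Since the question is about several models, the same pair of functions should also work on a whole list of models. Here is a small sketch, assuming the built-in mtcars data as a stand-in for sonarData and a temporary file as the path (both are illustrative choices, as are the names models and restored):

```r
# Two illustrative models on the built-in mtcars data (stand-in for sonarData)
models <- list(
  wt_only   = lm(mpg ~ wt, data = mtcars),
  wt_and_hp = lm(mpg ~ wt + hp, data = mtcars)
)

# Write the whole list to a single file; tempfile() is used only to
# keep this sketch self-contained
path <- tempfile(fileext = ".rds")
saveRDS(models, path)

# Read it back; each element is still a full lm object
restored <- readRDS(path)
names(restored)
all.equal(coef(restored$wt_only), coef(models$wt_only))
```

In real use you would replace the temporary path with a file name of your own.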

Upvotes: 1

joran

Reputation: 173627

The code list(type = any) does not declare a list that accepts any type. It creates a list of length one whose single element, named type, is the function any.
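You can check this at the console. A quick sketch that only inspects the object from the question:

```r
aa <- list(type = any)

length(aa)            # 1
names(aa)             # "type"
is.function(aa$type)  # TRUE: the element is the base function any()
```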

To create an empty list, try list(). Even better, try vector("list",5) to create an empty list of length 5. (That way you won't be growing your list as you add elements to it, which can be very, very slow.)

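And Aaron's comment reminded me to point out that to assign something as an element of a list (rather than as a sub-list) you should use [[ rather than [, i.e. aa[[1]] <- f1. With aa[1] <- f1, R treats f1 (itself a list of components) as a list of replacement values and keeps only the first one, the coefficients. That is why you got the warning and saw only the coefficients.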
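Putting those pieces together, here is a minimal sketch. It assumes the built-in mtcars data instead of sonarData, and the names m1, m2, models and bad are only for illustration:

```r
# Fit two small models on a built-in data set (mtcars stands in for sonarData)
m1 <- lm(mpg ~ wt, data = mtcars)
m2 <- lm(mpg ~ wt + hp, data = mtcars)

# Preallocate an empty list of length 2 rather than growing it
models <- vector("list", 2)

# [[ stores the whole model object as a single element
models[[1]] <- m1
models[[2]] <- m2

class(models[[1]])         # "lm"
length(coef(models[[2]]))  # 3

# By contrast, [ expects a list of replacement elements. An lm object is
# itself a list of components, so only its first component (the
# coefficients) ends up in the slot. R warns about the length mismatch;
# suppressWarnings() just keeps this sketch quiet.
bad <- vector("list", 1)
suppressWarnings(bad[1] <- m1)
is.numeric(bad[[1]])       # TRUE: just the coefficient vector
```

Once the models are in a list you can loop over them, for example lapply(models, coef) to pull out every set of coefficients.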

Upvotes: 8
