phonetagger

Reputation: 7873

How to store a "complex" data structure in R (not "complex numbers")

I need to train, store, and use a list/array/whatever of several ksvm SVM models so that, once I get a set of sensor readings, I can call predict() on each of the models in turn. I want to store these models and metadata about them in some sort of data structure, but I'm not very familiar with R, and getting a handle on its data structures has been a challenge. My familiarity is with C++, C, and C#.

I envision some sort of array or list that contains both the ksvm models as well as the metadata about them. (The metadata is necessary, among other things, for knowing how to select & organize the input data presented to each model when I call predict() on it.)

The data I want to store in this data structure includes the following for each entry of the data structure:

So in tinkering with how to do this, I tried the following....

First I tried what I thought would be really simple & crude, hoping to build on it later if this worked: A (list of (list of different data types))...

> 
> uname = Sys.getenv("USERNAME", unset="UNKNOWN_USER")
> cname = Sys.getenv("COMPUTERNAME", unset="UNKNOWN_COMPUTER")
> trainedAt = paste("Trained at", Sys.time(), "by", uname, "on", cname)
> trainedAt
[1] "Trained at 2015-04-22 20:54:54 by mminich on MMINICH1"
> sensorsToUse = c(12,14,15,16,24,26)
> sensorsToUse
[1] 12 14 15 16 24 26
> trustFactor = 88
> 
> TestModels = list()
> TestModels[1] = list(trainedAt, sensorsToUse, trustFactor)
Warning message:
In TestModels[1] = list(trainedAt, sensorsToUse, trustFactor) :
  number of items to replace is not a multiple of replacement length
> 
> TestModels
[[1]]
[1] "Trained at 2015-04-22 20:54:54 by mminich on MMINICH1"

> 

...wha? What did it think I was trying to replace? I was just trying to populate element 1 of TestModels. Later I would add an element [2], [3], etc... but this didn't work and I don't know why. Maybe I need to define TestModels as a list of lists right up front...

> TestModels = list(list())
> TestModels[1] = list(trainedAt, sensorsToUse, trustFactor)
Warning message:
In TestModels[1] = list(trainedAt, sensorsToUse, trustFactor) :
  number of items to replace is not a multiple of replacement length
> 

Hmm. That no workie either. Let's try something else...

> TestModels = list(list())
> TestModels[1][1] = list(trainedAt, sensorsToUse, trustFactor)
Warning message:
In TestModels[1][1] = list(trainedAt, sensorsToUse, trustFactor) :
  number of items to replace is not a multiple of replacement length
> 

Drat. Still no workie.

Please clue me in on how I can do this. And I'd really like to be able to access the fields of my data structure by name, perhaps something along the lines of...

> print(TestModels[1]["TrainedAt"])

Thank you very much!

Upvotes: 1

Views: 231

Answers (1)

Molx

Reputation: 6931

You were very close. To avoid the warning, you shouldn't use

TestModels[1] = list(trainedAt, sensorsToUse, trustFactor)

but instead

TestModels[[1]] = list(trainedAt, sensorsToUse, trustFactor)

To access a single list element, use [[ ]]. Using [ ] on a list returns a list containing the selected elements. The warning is shown because TestModels[1] selects exactly one slot, while the right-hand side is a list of 3 elements; R stores only the first of them and drops the rest. It makes no difference whether the slot already exists, so another index gives the same warning:

TestModels[2] = list(trainedAt, sensorsToUse, trustFactor) # Same warning: one slot, three replacement items
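
As a minimal sketch of the two assignment forms that do store a whole record in one slot: double brackets take the record directly, while single brackets need a right-hand side that is itself a list of length 1. The values below are placeholders standing in for the ones in the question.

```r
# Placeholder values standing in for the question's variables.
trainedAt    <- "Trained at (some time) by (some user)"
sensorsToUse <- c(12, 14, 15, 16, 24, 26)
trustFactor  <- 88

TestModels <- list()

# Double brackets: the whole 3-element record goes into slot 1.
TestModels[[1]] <- list(trainedAt, sensorsToUse, trustFactor)

# Single brackets: wrap the record in an outer list() so that one slot
# receives exactly one replacement item.
TestModels[2] <- list(list(trainedAt, sensorsToUse, trustFactor))

length(TestModels)                           # 2
length(TestModels[[1]])                      # 3
identical(TestModels[[1]], TestModels[[2]])  # TRUE
length(TestModels[1])                        # 1 (a list holding the record)
```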

To understand list subsetting better, take a look at this:

item1 <- list("a", 1:10, c(T, F, T))
item2 <- list("b", 11:20, c(F, F, F))
mylist <- list(item1=item1, item2=item2)

mylist[1] #This returns a list containing the item 1.
#$item1 #Note the item name of the container list
#$item1[[1]]
#[1] "a"
#
#$item1[[2]]
# [1]  1  2  3  4  5  6  7  8  9 10
#
#$item1[[3]]
#[1]  TRUE FALSE  TRUE
#

mylist[[1]] #This returns item1
#[[1]] #Note this is the same as item1
#[1] "a"
#
#[[2]]
# [1]  1  2  3  4  5  6  7  8  9 10
#
#[[3]]
#[1]  TRUE FALSE  TRUE

To access the list items by name, just name them when creating the list:

mylist <- list(var1 = "a", var2 = 1:10, var3 = c(T, F, T))
mylist$var1 #Or mylist[["var1"]]
# [1] "a"
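
Applied to the fields from the question, a named record might look like the sketch below. The field names are just the question's variable names reused, and `notes` is a hypothetical extra field added for illustration.

```r
# One record with named fields (placeholder values).
record <- list(
  trainedAt    = "Trained at (some time) by (some user)",
  sensorsToUse = c(12, 14, 15, 16, 24, 26),
  trustFactor  = 88
)

names(record)                # "trainedAt" "sensorsToUse" "trustFactor"
record$trustFactor           # 88
record[["sensorsToUse"]][2]  # 14

# When the field name is held in a variable, use [[ ]]; $ takes the name literally.
field <- "trustFactor"
record[[field]]              # 88

# A field can also be added by name after the list has been created.
record$notes <- "hypothetical extra field"
```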

You can nest these operators as you suggested. So you could use

containerlist <- list(mylist)
containerlist[[1]]$var1
#[1] "a"
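
Putting the pieces together, here is a rough sketch of a container of models plus metadata, with predict() called on each entry in turn. Several things in it are assumptions rather than anything from the question: lm() stands in for ksvm() so the example runs with base R only, the helper makeEntry() is hypothetical, and sensors are identified by made-up column names (s12, s14, s15) instead of numeric indices.

```r
# Made-up training data: one column per sensor plus a response y.
set.seed(1)
sensors <- data.frame(s12 = rnorm(20), s14 = rnorm(20), s15 = rnorm(20))
sensors$y <- sensors$s12 + 2 * sensors$s14 + rnorm(20, sd = 0.1)

# Hypothetical helper: bundle a fitted model with its metadata in one named list.
makeEntry <- function(sensorsToUse, trustFactor, data) {
  # lm() is a stand-in for ksvm(); any model with a predict() method fits here.
  model <- lm(reformulate(sensorsToUse, response = "y"), data = data)
  list(
    model        = model,
    trainedAt    = paste("Trained at", Sys.time()),
    sensorsToUse = sensorsToUse,
    trustFactor  = trustFactor
  )
}

# The container: a named list of records.
TestModels <- list(
  first  = makeEntry(c("s12", "s14"), 88, sensors),
  second = makeEntry(c("s14", "s15"), 42, sensors)
)

# One new set of sensor readings, using the same column names.
reading <- data.frame(s12 = 0.5, s14 = -1, s15 = 2)

# Each entry's metadata selects the input columns for its own model.
predictions <- vapply(TestModels, function(entry) {
  input <- reading[, entry$sensorsToUse, drop = FALSE]
  unname(predict(entry$model, newdata = input))
}, numeric(1))

TestModels$first$trustFactor   # 88
TestModels[[2]]$sensorsToUse   # "s14" "s15"
names(predictions)             # "first" "second"
```

Keeping the model and its metadata in the same record means the column selection travels with the model, so the loop never needs to know which sensors a given model was trained on.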

Upvotes: 3
