WhoAmI

Reputation: 148

R - Multi-level list indexing

What is the convention to assign an object to a multi-level list?

So far I thought the convention for indexing was to use [[]] instead of $.

Hence, when saving results in loops I usually used the following approach:

> result <- matrix(2,2,2)

> result_list <- list()
> result_list[["A"]][["B"]][["C"]] <- result
> print(result_list)
$A
$A$B
$A$B$C
     [,1] [,2]
[1,]    2    2
[2,]    2    2

This works as intended with a matrix. But when assigning a single number, the list seems to skip the last level.

> result <- 2
> result_list <- list()
> result_list[["A"]][["B"]][["C"]] <- result
> print(result_list)
$A
B 
2 

At the same time, if I use $ instead of [[]], the list is again as intended.

> result_list$A$B$C <- result
> print(result_list)
$A
$A$B
$A$B$C
[1] 2

As mentioned here you can also use list("A" = list("B" = list("C" = 2))).

Which of these methods should be used for indexing a multi-level list in R?

Upvotes: 3

Views: 1467

Answers (1)

Dominic van Essen

Reputation: 872

Although the title of the question refers to multi-level list indexing, and the syntax mylist[['a']][['b']][['c']] is the same as one would use to retrieve an element of a multi-level list, the differences that you're observing actually arise from using that syntax to create (or fail to create) multi-level lists.

To show this, we can first explicitly create the multi-level (nested) lists, and then check that the indexing works as expected both for matrices and for single numbers.

mymatrix=matrix(1:4,nrow=2)
list_b=list(c=mymatrix)
list_a=list(b=list_b)
mynestedlist1=list(a=list_a)
str( mynestedlist1 )
# List of 1
#  $ a:List of 1
#   ..$ b:List of 1
#   .. ..$ c: int [1:2, 1:2] 1 2 3 4

mynumber=2
list_e=list(f=mynumber)
list_d=list(e=list_e)
mynestedlist2=list(d=list_d)
str( mynestedlist2 )
# List of 1
#  $ d:List of 1
#   ..$ e:List of 1
#   .. ..$ f: num 2

(Note that I've created the lists in sequential steps for clarity; they could all have been rolled together into a single line, like: mynestedlist2=list(d=list(e=list(f=mynumber))) )

Anyway, now we'll check that indexing works OK:

str(mynestedlist1[['a']][['b']][['c']])
# int [1:2, 1:2] 1 2 3 4
str(mynestedlist1$a$b$c)
# int [1:2, 1:2] 1 2 3 4

str(mynestedlist2[['d']][['e']][['f']])
# num 2
str(mynestedlist2$d$e$f)
# num 2

# and, just to check that we don't 'skip the last level':
str(mynestedlist2[['d']][['e']])
# List of 1
#  $ f: num 2

So the direct answer to the question 'which of these methods should be used for indexing a multi-level list in R' is: 'any of them - they're all ok'.
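As a small illustration of that equivalence (a sketch only; nested and key are names made up here, not taken from the question), the three forms below retrieve the same element. The practical difference inside a loop is that [[ ]] accepts a name held in a variable, whereas $ takes the name literally:

```r
# Hypothetical nested list, built explicitly so its structure is unambiguous
nested <- list(d = list(e = list(f = 2)))
key <- "d"   # a name held in a variable, as it would be inside a loop

via_brackets  <- nested[[key]][["e"]][["f"]]   # [[ ]] evaluates 'key'
via_dollar    <- nested$d$e$f                  # $ uses the literal name
via_recursive <- nested[[c("d", "e", "f")]]    # [[ ]] applied recursively to a list

identical(via_brackets, via_dollar)
identical(via_brackets, via_recursive)

# $ does not evaluate the variable: it looks for an element literally called "key"
is.null(nested$key)
```

So for retrieval the choice is mostly about convenience: [[ ]] when the name is computed, $ when it is typed out.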

So what's going on with the examples in the question, then?

Here, the same syntax is being used to try to implicitly create lists, and since the structure of the nested list is not specified explicitly, this relies on whether R can infer the structure that you want.

In the first and third examples, there's no ambiguity, but each for a different reason:

First example:

mynestedlist1=list()
mynestedlist1[['a']][['b']][['c']]=mymatrix

We've specified that mynestedlist1 is a list. But its elements could be any kind of object, until we assign them. In this case, we put into the element named 'a' an object with an element 'b' that contains an object with an element 'c' that is a matrix. Since there's no R object that can contain a matrix in a single element except a list, the only way to achieve this assignment is by creating a nested list.

Third example:

mynestedlist3=list()
mynestedlist3$g$h$i=mynumber

In this case, we've used the $ notation, which only applies to lists (or to data types that are similar/equivalent to lists, like dataframes). So, again, the only way to follow the instructions of this assignment is by creating a nested list.
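A minimal sketch of why $ leaves no room for a plain vector (named_vec and dollar_result are names invented for this illustration): [[ ]] is valid on an atomic vector, but $ is not, so an assignment written with $ can only be satisfied by a list.

```r
named_vec <- c(a = 1)                 # an atomic (non-list) vector with a name

bracket_result <- named_vec[["a"]]    # allowed: [[ ]] works on vectors and lists

# $ is not defined for atomic vectors, so this call raises an error,
# which is caught here and turned into a marker string
dollar_result <- tryCatch(named_vec$a, error = function(e) "error")

bracket_result
dollar_result
```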

Finally, the pesky second example, but starting with a simpler variant of it:

mylist2=list()
mylist2[['c']][['d']]=mynumber

Here there's an ambiguity. We've specified that mylist2 is a list, and we've put into the element named 'c' an object with an element 'd' that contains a single number. This element could have been a list, but it can also be a simple vector, and in this case R chooses this as the simpler option:

str(mylist2)
# List of 1
#  $ c: Named num 2
#   ..- attr(*, "names")= chr "d"

Contrast this with the behaviour when trying to assign a matrix using exactly the same syntax: in that case, the only way to follow the syntax is to create another, nested, list inside the first one.
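One way to remove the ambiguity for a single number, sketched here under the assumption that a nested list is what you want, is to create the intermediate level as a list yourself before assigning into it:

```r
mynumber <- 2
mylist2 <- list()

# Create the intermediate level explicitly, so R has nothing to infer
mylist2[["c"]] <- list()
mylist2[["c"]][["d"]] <- mynumber

str(mylist2)
is.list(mylist2[["c"]])
```

Because element 'c' already exists as a list, assigning to 'd' adds an element to that list rather than creating a named vector.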

What about the full second example mylist2[['c']][['d']][['e']]=mynumber, where we try to assign a number named 'e' to the just-created but still-empty object 'd'?
This seems rather unclear, and this may be the reason for the different behaviours of different versions of R (as reported in the comments to the question). In the question, the action taken by R has been to assign the number while dropping its name, similarly to:

myvec=vector(); myvec2=vector()
myvec[['a']]=1
myvec2[['b']]=2
myvec[['a']]=myvec2
str(myvec)
#  Named num 2
#  - attr(*, "names")= chr "a"

However, the syntax alone doesn't seem to force this behaviour, so it would be sensible to avoid relying on this behaviour when trying to create nested lists, or lists of vectors.
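If you need to fill nested lists inside loops, one option is a small helper that builds every intermediate level as a list explicitly. The function below is only a sketch: set_nested is a hypothetical name, not a base R function, and it assumes that any existing non-list value at an intermediate level may be overwritten and that value is not NULL.

```r
# Hypothetical helper: assign 'value' at the path given by 'keys',
# forcing every intermediate level to be a list.
set_nested <- function(x, keys, value) {
  if (!is.list(x)) x <- list()      # replace NULL or an atomic value with a list
  if (length(keys) == 1L) {
    x[[keys]] <- value              # last level: store the value itself
    return(x)
  }
  # Recurse into (or create) the next level, then store it back
  x[[keys[1]]] <- set_nested(x[[keys[1]]], keys[-1], value)
  x
}

result_list <- list()
result_list <- set_nested(result_list, c("A", "B", "C"), 2)
result_list <- set_nested(result_list, c("A", "B", "D"), matrix(2, 2, 2))
str(result_list)
```

With this approach a single number and a matrix are stored the same way, since the structure no longer depends on what R infers from the value being assigned.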

Upvotes: 2
