algoboy

Reputation: 63

lapply gives different results when a list element is accessed with the $ sign instead of [ ]

When I created the list and subset it with the "[ ]" operator, I got the following result:

x <- list(a=1:5, b=rnorm(5))
lapply(x[1], mean)
$a
[1] 3
lapply(x[2], sum)
$b
[1] 0.3653843

But when I access the same list with the $ sign, I get a different result:

> x <- list(a=1:5, b=rnorm(5))
> lapply(x$a, mean)
[[1]]
[1] 1

[[2]]
[1] 2

[[3]]
[1] 3

[[4]]
[1] 4

[[5]]
[1] 5

> lapply(x$b, sum)
[[1]]
[1] 0.7208679

[[2]]
[1] 1.367853

[[3]]
[1] -0.5799428

[[4]]
[1] -2.186257

[[5]]
[1] 0.1597629

I can't understand why.

Upvotes: 0

Views: 201

Answers (3)

MagicScout

Reputation: 105

The difference is relatively small but very meaningful. A list is a fancy word for a vector whose elements don't have to be of the same type (int, char, logical, etc.). In fact, the components of a list can be anything, even other lists.

To use an analogy:

A vector is a box. The only rule that applies to this box is that all the things in the box have to be of the same type. Things we put in boxes (such as numbers or Boolean values) are called elements.

A list is a crate. The only rule for crates is that we can put only boxes and other crates in the crate but not elements. Things we put in crates are called components.

In order to get things out of our vectors/boxes or lists/crates we use three operators (everything in R is a function, including these), each meaning something a little different:

  • square brackets [k]. These mean "get the k^th element from the vector" or, in the case of a list, "get me the k^th component of the list, still packed in a crate". What's the difference, you might ask? Requesting an element from a vector gets you a value (e.g. TRUE, "john doe", 3), whereas requesting a component from a list with [k] always gets you another list, of length one, holding that component (in terms of the analogy: what comes out of a crate is a smaller crate with the box or crate still inside).
  • double square brackets [[k]]. These mean "get me the contents of the k^th element from the vector" or, in the case of a list, "get me the contents of the k^th component in the list". For a vector these double brackets aren't very useful, since a vector cannot contain something that in turn contains something else. In the box analogy: you're asking for the contents of an element, and since an element has no contents, R returns the element itself. For a list, R goes to the k^th component and returns its contents. Using the analogy: R goes to the crate, picks out the k^th box (or crate) in it, and hands you that box (or crate) without the surrounding crate.
  • dollar symbol $. This symbol is used almost exclusively with lists, since it lets you access named components of a list by name. Its main benefit is that it allows you to refer to components in a list as if they were variables in the workspace.
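The three operators described above can be compared side by side. This is a minimal sketch; the list uses fixed values for b instead of rnorm() so the results are reproducible, and the vector v is made up for illustration.

```r
# Illustrative list; b uses fixed values (an assumption) instead of rnorm()
x <- list(a = 1:5, b = c(0.5, 1.5, 2.5))

class(x[1])               # "list": a crate that still holds one box
class(x[[1]])             # "integer": the box itself
class(x$a)                # "integer": same thing, accessed by name
identical(x$a, x[["a"]])  # TRUE

# On a plain vector, [ and [[ give the same single value
v <- c(10, 20, 30)
identical(v[2], v[[2]])   # TRUE
```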

In your example two things are going a little wrong:

First, lapply is an apply function that expects to receive a list object as input (even if the documentation doesn't explicitly say so). You can see this by printing out the lapply code:

lapply
function (X, FUN, ...) 
{
    FUN <- match.fun(FUN)
    if (!is.vector(X) || is.object(X)) 
        X <- as.list(X)
    .Internal(lapply(X, FUN))
}

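Notice that when the input is not a plain vector (or is an object with a class), R converts it to a list with the as.list function. An atomic vector such as x$a is already a vector, so it skips that conversion, but the internal code still applies FUN to each element in turn, exactly as if it had been converted. In effect, the function operates on components, and each element of an atomic vector is treated as its own component.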
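One way to check this behaviour is to compare lapply on a vector with lapply on the same vector converted by as.list. This is a small sketch with made-up values, not part of the original example.

```r
# Illustrative list; b uses fixed values (an assumption) instead of rnorm()
x <- list(a = 1:5, b = c(0.5, 1.5, 2.5))

# Applying over the vector gives the same result as applying over as.list() of it
identical(lapply(x$a, mean), lapply(as.list(x$a), mean))  # TRUE

length(lapply(x$a, mean))   # 5: one result per element of the vector
length(lapply(x[1], mean))  # 1: one result for the single component
```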

In your first input

lapply(x[1], mean)
lapply(x[2], sum)

you are giving the function a list holding a single component (a crate with one box in it); in the second input

lapply(x$a, mean)
lapply(x$b, sum)

you are giving the function elements. You can see the difference in how R prints each: x$a prints like a vector, x[1] prints like a list. Once the function receives the elements, it treats each element as a separate component of a list, which is the structure the following call produces:

as.list(x$a)

where each component in the new list is a vector with 1 element.
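If the goal is one summary per component, a few alternatives avoid the element-by-element behaviour. This is a sketch with fixed illustrative values; which form fits best depends on what you want back.

```r
# Illustrative list; b uses fixed values (an assumption) instead of rnorm()
x <- list(a = 1:5, b = c(0.5, 1.5, 2.5))

length(as.list(x$a))    # 5: each element became its own component

mean(x$a)               # 3: call the function on the vector directly
lapply(x["a"], mean)$a  # 3: keep the list wrapper so lapply sees one component
sapply(x, mean)         # named numeric vector: a = 3, b = 1.5
```

Calling mean(x$a) directly is the simplest option when only one component is involved; lapply or sapply over the whole list is more useful when every component needs the same summary.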

tl;dr: don't confuse components and elements :).

Upvotes: 0

Theja Tulabandhula

Reputation: 921

In the first case, the input to lapply is a list with one element, equal to 1:5 or rnorm(5). In the second case, the input to lapply is a vector with 5 elements, so the mean function gets each value 1, 2, 3, 4, 5 separately (and does nothing in this case but return the same value).

In other words, x[1] gives a list of one element:

> x <- list(a=1:5, b=rnorm(5))
> str(x[1])
List of 1
$ a: int [1:5] 1 2 3 4 5

Whereas, x$a is equal to x[["a"]] or x[[1]] and gives a vector with 5 elements:

str(x$a)
int [1:5] 1 2 3 4 5

Upvotes: 0

Sven Hohenstein

Reputation: 81693

There's a major difference between $ and [. While $ returns the list element, [ returns a list containing one element.

> x[1]
$a
[1] 1 2 3 4 5

> x$a
[1] 1 2 3 4 5

An equivalent expression to x$a is x[[1]]. [[ also returns the list element.

> x[[1]]
[1] 1 2 3 4 5

Since both $ and [[ return a single list element, you can't use them to return multiple ones. However, you can use [ to return a list with multiple elements. For example,

> x[1:2]
$a
[1] 1 2 3 4 5

$b
[1]  0.3465471  0.2955350  1.1292449  1.1136643 -0.9798430
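As a small illustration of the point about multiple elements, the sketch below selects several components with [ and shows what [[ does when given more than one index. The values for b are fixed here (an assumption) so the output is reproducible.

```r
# Illustrative list; b uses fixed values (an assumption) instead of rnorm()
x <- list(a = 1:5, b = c(0.5, 1.5, 2.5))

sub <- x[c("a", "b")]  # [ accepts several names or positions and returns a list
names(sub)             # "a" "b"

# [[ with a vector of indices does not select two components;
# it indexes recursively: x[[1:2]] is the 2nd element of the 1st component
x[[1:2]]               # 2
```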

Upvotes: 1
