Reputation: 639
Here is the source code for the "ave" function in R:
function (x, ..., FUN = mean)
{
    if (missing(...))
        x[] <- FUN(x)
    else {
        g <- interaction(...)
        split(x, g) <- lapply(split(x, g), FUN)
    }
    x
}
I am having trouble understanding how the assignment, "split(x, g) <- lapply(split(x, g), FUN)" works. Consider the following example:
# Overview: function inputs and outputs
> x = 10*1:6
> g = c('a', 'b', 'a', 'b', 'a', 'b')
> ave(x, g)
[1] 30 40 30 40 30 40
# Individual components of "split" assignment
> split(x, g)
$a
[1] 10 30 50
$b
[1] 20 40 60
> lapply(split(x, g), mean)
$a
[1] 30
$b
[1] 40
# Examine "x" before and after assignment
> x
[1] 10 20 30 40 50 60
> split(x, g) <- lapply(split(x, g), mean)
> x
[1] 30 40 30 40 30 40
Questions:
• Why does the assignment, "split(x,g) <- lapply(split(x,g), mean)", directly modify x? Does "<-" always modify the first argument of a function, or is there some other rule for this?
• How does this assignment even work? Both the "split" and "lapply" statements have lost the original ordering of x. They are also length 2. How do you end up with a vector of length(x) that matches the original ordering of x?
Upvotes: 1
Views: 532
Reputation: 2753
This is a tricky one. "<-" usually does not work in this way. What is actually happening is that you are not calling split(), you are calling a replacement function called "split<-"(). The documentation of split says:
[...] The replacement forms replace values corresponding to such a division. unsplit reverses the effect of split.
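To make the rule concrete, here is a minimal sketch of how R treats an assignment whose left-hand side is a function call. It reuses x and g from the question. The function "second<-" and the variables x1, x2 and y are hypothetical names made up for illustration. The rewritten form is roughly what R evaluates, not its exact internal steps.

```r
x <- 10 * 1:6
g <- c("a", "b", "a", "b", "a", "b")

# Form 1: the assignment as it is written inside ave().
x1 <- x
split(x1, g) <- lapply(split(x1, g), mean)

# Form 2: roughly what R evaluates instead. The replacement function
# `split<-` is called with the right-hand side passed as `value`, and
# its return value is assigned back to the variable.
x2 <- x
x2 <- `split<-`(x2, g, value = lapply(split(x2, g), mean))

identical(x1, x2)

# The same rule applies to any function whose name ends in "<-".
# `second<-` is a hypothetical example, not part of base R.
`second<-` <- function(x, value) {
    x[2] <- value
    x
}
y <- c(1, 2, 3)
second(y) <- 99   # evaluated roughly as: y <- `second<-`(y, value = 99)
y
```

So "<-" does not modify the first argument of an arbitrary function. The form f(x) <- value only works when a function named "f<-" exists, and x changes because the result of that function is assigned back to x.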
See also this answer
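As for how the original ordering comes back: the following is a sketch of the idea behind the replacement form, not a copy of the base R source. It assumes the replacement works by splitting the positions of x by group and then assigning into those positions. The names idx, value and result are made up for illustration.

```r
x <- 10 * 1:6
g <- c("a", "b", "a", "b", "a", "b")

# One mean per group, as in the question: a = 30, b = 40.
value <- lapply(split(x, g), mean)

# Split the positions of x rather than its values, so each group
# remembers where its elements sit: a = 1 3 5, b = 2 4 6.
idx <- split(seq_along(x), g)

result <- x
for (k in seq_along(idx)) {
    # Ordinary subassignment recycles the single mean across all
    # positions that belong to group k.
    result[idx[[k]]] <- value[[k]]
}
result

identical(result, ave(x, g))
```

The ordering is never really lost, because g still records which group each position of x belongs to. The list of length 2 supplies one value per group, and each value is recycled over that group's positions.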
Upvotes: 5