hmhensen

Reputation: 3195

purrr accumulate function strange output

I guess this is a combination of two questions.

I'm trying to understand the purrr::accumulate function but having a hard time figuring out the interplay between .x and .y. I've been reading the purrr vignette but it's not explaining much. I don't have a programming background so a lot of it is over my head.

First thing, why do these have different outputs when they are the same call? The only difference is the () after paste.

accumulate(letters[1:5], paste, sep = ".")
[1] "a"         "a.b"       "a.b.c"     "a.b.c.d"   "a.b.c.d.e"

accumulate(letters[1:5], paste(), sep = ".")
[1] "a" "a" "a" "a" "a"

Second, what is going on here?

2:11 %>% accumulate(~ .x)
[1] 2 2 2 2 2 2 2 2 2 2

accumulate(1:5, `+`)
[1]  1  3  6 10 15

2:11 %>% accumulate(~ .y)
[1]  2  3  4  5  6  7  8  9 10 11

2:11 %>% accumulate(~ .x + .y)
[1]  2  5  9 14 20 27 35 44 54 65

.x is the accumulating value, but I guess it's not accumulating anything? It makes sense for 1:5 as cumsum. .y is the element in the list. Am I correct in interpreting .y as essentially print?

But wouldn't the first output of .x + .y be 4?

Some insight would be very welcome.

Upvotes: 3

Views: 470

Answers (1)

akrun

Reputation: 886938

It is easier to understand with print statements added inside the function:

2:11 %>%
  accumulate(~ {
    print("---step----")
    print(paste0(".x: ", .x))
    print(paste0(".y: ", .y))
    print(.x + .y)
  })
#[1] "---step----"
#[1] ".x: 2"  # .init or first value of the vector (as `.init` not specified)
#[1] ".y: 3"  # second value
#[1] 5    # sum of .x + .y
#[1] "---step----"
#[1] ".x: 5"   # note .x gets updated with the sum
#[1] ".y: 4"   # .y gets the next element
#[1] 9
#[1] "---step----"
#[1] ".x: 9"   # similarly in all the below steps
#[1] ".y: 5"
#[1] 14
#[1] "---step----"
#[1] ".x: 14"
#[1] ".y: 6"
#[1] 20
#[1] "---step----"
#[1] ".x: 20"
#[1] ".y: 7"
#[1] 27
#[1] "---step----"
#[1] ".x: 27"
#[1] ".y: 8"
#[1] 35
#[1] "---step----"
#[1] ".x: 35"
#[1] ".y: 9"
#[1] 44
#[1] "---step----"
#[1] ".x: 44"
#[1] ".y: 10"
#[1] 54
#[1] "---step----"
#[1] ".x: 54"
#[1] ".y: 11"
#[1] 65
# [1]  2  5  9 14 20 27 35 44 54 65

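Here, .x is the value that gets updated in each iteration: whatever the function returned in the previous step is passed back in as .x, while .y is the next element of the input. The first element (2) is taken as the starting .x, so the first call is 2 + 3 = 5, not 2 + 2 = 4.

This is essentially the same as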

cumsum(2:11)
#[1]  2  5  9 14 20 27 35 44 54 65
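
The stepping shown above can also be written as a plain loop. This is only a minimal sketch in base R, assuming a non-empty input and no .init; `accumulate_sketch` is a hypothetical helper written for illustration, not a purrr function:

```r
# Hypothetical helper (not part of purrr): a plain loop with the same shape
# as accumulate() when .init is not supplied. Assumes length(x) >= 1.
accumulate_sketch <- function(x, f) {
  out <- vector("list", length(x))
  out[[1]] <- x[[1]]                 # the first element seeds .x
  for (i in seq_along(x)[-1]) {
    # .x is the previous result, .y is the next input element
    out[[i]] <- f(out[[i - 1]], x[[i]])
  }
  unlist(out)
}

sum_xy <- accumulate_sketch(2:11, function(.x, .y) .x + .y)  # running sum
only_x <- accumulate_sketch(2:11, function(.x, .y) .x)       # first element repeated
print(sum_xy)
print(only_x)
```

Because the first element is copied straight into the output, the function is called one time fewer than there are elements, which matches the nine steps printed above for a ten-element input.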

When the function returns only .x, nothing ever changes: .x starts out as the first element (because .init is not specified), and since .f just hands it back, the same value is carried through every step.

2:11 %>% 
    accumulate(~ print(.x))
#[1] 2
#[1] 2
#[1] 2
#[1] 2
#[1] 2
#[1] 2
#[1] 2
#[1] 2
#[1] 2
# [1] 2 2 2 2 2 2 2 2 2 2

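Now pass a different value as .init. The starting .x is then 5 instead of the first element, and the output has one more element than the input because .init itself is included in the result.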

2:11 %>%
    accumulate(~ print(.x), .init = 5)
#[1] 5
#[1] 5
#[1] 5
#[1] 5
#[1] 5
#[1] 5
#[1] 5
#[1] 5
#[1] 5
#[1] 5
#[1] 5 5 5 5 5 5 5 5 5 5 5
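
To tie this back to the `~ .y` case in the question: a function that returns only .y discards the accumulated value and hands back each input element, so the input comes out unchanged (it is not printing anything). A short sketch, assuming purrr is installed; the starting value 100 is chosen only for illustration:

```r
library(purrr)

# Returning only .y ignores the accumulated value, so the input comes back as-is
only_y <- accumulate(2:11, ~ .y)

# With .init, .x starts at 100 and .y starts at the first element (2),
# so the result has one more element than the input
with_init <- accumulate(2:11, ~ .x + .y, .init = 100)

print(only_y)
print(with_init)
print(length(with_init))
```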

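As for the first two calls, the difference is in what gets passed as .f. In the first case, paste is the function itself, so accumulate calls it with .x and .y (plus sep = "."). In the second case, paste() is evaluated before accumulate ever runs; it returns an empty character vector rather than a function, and the resulting .f behaves as if only .x were going into it, so the first element is returned at every step. Written as formulas, the two calls behave like: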

accumulate(letters[1:5], ~paste(.x, .y, sep = "."))
#[1] "a"         "a.b"       "a.b.c"     "a.b.c.d"   "a.b.c.d.e"
accumulate(letters[1:5], ~paste(.x,  sep = "."))
#[1] "a" "a" "a" "a" "a"
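
The distinction between `paste` and `paste()` can be checked directly. The sketch below assumes purrr is installed; the argument names `acc` and `nxt` are illustrative stand-ins for .x and .y, not anything purrr requires:

```r
library(purrr)

# `paste` is a function object; `paste()` is the result of calling it with no arguments
print(is.function(paste))     # TRUE
print(is.function(paste()))   # FALSE
print(length(paste()))        # 0, i.e. character(0)

# Spelling out both arguments in an anonymous function makes the call unambiguous
dotted <- accumulate(letters[1:5], function(acc, nxt) paste(acc, nxt, sep = "."))
print(dotted)
```

Writing the function out this way avoids relying on extra arguments being forwarded through accumulate, at the cost of a slightly longer call.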

Upvotes: 4
