Reputation: 5204
I'm encountering an error with `dplyr::bind_rows()` that I don't understand. I want to split my data based on some condition (e.g. `a == 1`), operate on one part (e.g. `b = b * 10`), and bind it back to the other part using `dplyr::bind_rows()` in a single pipe chain. It works fine if I supply the input to the two parts explicitly, but if I instead pipe it in with `.`, it complains about the data type of argument 2.
Here's an MRE of the issue:
library(tidyverse)
# sim data
d <- tibble(a = 1:4, b = 1:4)
# works when 'd' is supplied directly to bind_rows()
bind_rows(d %>% filter(a == 1),
          d %>% filter(!a == 1) %>% mutate(b = b * 10))
#> # A tibble: 4 x 2
#>       a     b
#>   <int> <dbl>
#> 1     1     1
#> 2     2    20
#> 3     3    30
#> 4     4    40
# fails when 'd' is piped in to bind_rows()
d %>%
  bind_rows(. %>% filter(a == 1),
            . %>% filter(!a == 1) %>% mutate(b = b * 10))
#> Error: Argument 2 must be a data frame or a named atomic vector.
If I capture what the `bind_rows()` call is getting as input with `list()` instead, I can see that two unexpected (to me) things are happening.
1. The piped expressions are not evaluated; they are captured as functional sequences.
2. The input (`.`) is invisibly being provided in addition to the two explicit arguments, so I get 3 items instead of 2 in the list.
# capture intermediate values for diagnostics
d %>%
  list(. %>% filter(a == 1),
       . %>% filter(!a == 1) %>% mutate(b = b * 10))
#> [[1]]
#> # A tibble: 4 x 2
#>       a     b
#>   <int> <int>
#> 1     1     1
#> 2     2     2
#> 3     3     3
#> 4     4     4
#>
#> [[2]]
#> Functional sequence with the following components:
#>
#> 1. filter(., a == 1)
#>
#> Use 'functions' to extract the individual functions.
#>
#> [[3]]
#> Functional sequence with the following components:
#>
#> 1. filter(., !a == 1)
#> 2. mutate(., b = b * 10)
#>
#> Use 'functions' to extract the individual functions.
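As a quick check of what those last two elements are, the sketch below treats them as ordinary functions and calls them on `d`. The names `parts`, `keep` and `scaled` are made up for illustration and are not part of the original reprex.

```r
library(dplyr)

d <- tibble(a = 1:4, b = 1:4)

# same capture as above, stored so the pieces can be inspected
parts <- d %>%
  list(. %>% filter(a == 1),
       . %>% filter(!a == 1) %>% mutate(b = b * 10))

length(parts)            # d itself plus the two captured pipelines
is.function(parts[[2]])  # a functional sequence can be called like a function

# calling the captured pipelines on d yields the two pieces
keep   <- parts[[2]](d)
scaled <- parts[[3]](d)
bind_rows(keep, scaled)
```

So nothing has been filtered or mutated at the point where `bind_rows()` sees its arguments; it is handed two functions, which would explain the complaint about argument 2.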
This leads me to the following inelegant solution, where I solve the first problem by piping to the inner function, which seems to force evaluation correctly (for reasons I don't understand), and then solve the second problem by subsetting the list prior to performing the `bind_rows()` operation.
# hack solution to force eval and clean duplicated input
d %>%
  list(filter(., a == 1),
       filter(., !a == 1) %>% mutate(b = b * 10)) %>%
  .[-1] %>%
  bind_rows()
#> # A tibble: 4 x 2
#>       a     b
#>   <int> <dbl>
#> 1     1     1
#> 2     2    20
#> 3     3    30
#> 4     4    40
Created on 2022-01-24 by the reprex package (v2.0.1)
It seems like it might be related to this issue, but I can't quite see how. It would be great to understand why this is happening and to find a way to code this without the need to assign intermediate variables or do this weird hack to subset the intermediate list.
Knowing this was related to curly braces (`{}`) enabled me to find a few more helpful links: 1, 2, 3
Upvotes: 2
Views: 657
Reputation: 887048
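If we want to use `.`, then wrap the call in curly braces (`{}`), which stops the pipe from also inserting the piped value as the first argument. Writing `{.}` instead of a bare `.` keeps `. %>% ...` from being read as a function definition.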
library(dplyr)
d %>%
  {
    bind_rows({.} %>% filter(a == 1),
              {.} %>% filter(!a == 1) %>% mutate(b = b * 10))
  }
-output
# A tibble: 4 × 2
      a     b
  <int> <dbl>
1     1     1
2     2    20
3     3    30
4     4    40
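To see the effect of the braces in isolation, here is a minimal sketch assuming the same `d` as in the question; `n_plain`, `n_braced` and `res` are names made up for illustration. It compares how many arguments `list()` receives with and without the braces, then shows a variant of the code above that passes `.` as an explicit argument to `filter()` instead of using `{.}`.

```r
library(dplyr)

d <- tibble(a = 1:4, b = 1:4)

# `.` used only inside a nested call: d is still inserted as the first argument
n_plain <- d %>% list(nrow(.)) %>% length()       # 2: d and nrow(d)

# right-hand side wrapped in braces: only what is written inside is evaluated
n_braced <- d %>% { list(nrow(.)) } %>% length()  # 1: nrow(d) only

# variant of the answer: `.` passed explicitly, so no `{.}` is needed
res <- d %>% {
  bind_rows(filter(., a == 1),
            filter(., !a == 1) %>% mutate(b = b * 10))
}
res
```

Either spelling works inside the braces; which one reads better is mostly a matter of taste.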
Upvotes: 4