Reputation: 21918
I have been exploring the various application of using pmap
function and its variations recently and I am particularly interested in using c(...)
to pass all the arguments into. The following data set belongs to another question that we discussed earlier today with a number of very knowledgeable users.
We were supposed to repeat the values in weight
column based on values in Days
column along their respective rows to get the following output:
df <- tribble(
~Name, ~School, ~Weight, ~Days,
"Antoine", "Bach", 0.03, 5,
"Antoine", "Ken", 0.02, 7,
"Barbara", "Franklin", 0.04, 3
)
Output:
df %>%
mutate(map2_dfr(Weight, Days, ~ set_names(rep(.x, .y), 1:.y))) %>%
select(-c(Weight, Days))
# A tibble: 3 x 9
Name School `1` `2` `3` `4` `5` `6` `7`
<chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Antoine Bach 0.03 0.03 0.03 0.03 0.03 NA NA
2 Antoine Ken 0.02 0.02 0.02 0.02 0.02 0.02 0.02
3 Barbara Franklin 0.04 0.04 0.04 NA NA NA NA
My question is this output is achievable through various solutions but the following one proposed by one of the contributors caught my attention. I would like to know how I could rewrite it by means of c(...)
# This is not my code and it works:
pmap_dfr(df, function(Weight, Days, ...) c(..., setNames(rep(Weight, Days), 1:Days)))
# And I can also rewrite it in the following way which also works:
df %>%
mutate(data = pmap(list(Weight, Days), ~ setNames(rep(.x, .y), 1:.y))) %>%
unnest_wider(data)
But I would like to know why any of these doesn't work:
df %>%
mutate(pmap_dfr(., ~ c(..., setNames(rep(Weight, Days), 1:Days))))
df %>%
pmap_dfr(., ~ c(..., setNames(rep(Weight, Days), 1:Days)))
Thank you very much in advance and so sorry for the long description.
Upvotes: 3
Views: 465
Reputation: 26218
This may not make much value addition, but may be helpful for understanding things in lambda functions.
pmap_df(df, ~ c(setNames(c(..1, ..2), names(df[1:2])), setNames(rep(..3, ..4), seq_len(..4))))
# A tibble: 3 x 9
Name School `1` `2` `3` `4` `5` `6` `7`
<chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 Antoine Bach 0.03 0.03 0.03 0.03 0.03 NA NA
2 Antoine Ken 0.02 0.02 0.02 0.02 0.02 0.02 0.02
3 Barbara Franklin 0.04 0.04 0.04 NA NA NA NA
pmap_df
only is sufficient and pmap_dfr
may be redundantOr this will also do
pmap_df(df, ~ c(list(...)[1:2], setNames(rep(..3, ..4), seq_len(..4))))
# A tibble: 3 x 9
Name School `1` `2` `3` `4` `5` `6` `7`
<chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Antoine Bach 0.03 0.03 0.03 0.03 0.03 NA NA
2 Antoine Ken 0.02 0.02 0.02 0.02 0.02 0.02 0.02
3 Barbara Franklin 0.04 0.04 0.04 NA NA NA NA
Upvotes: 2
Reputation: 887213
The issue seems to be mixing the custom anonymous/lambda function (function(Weight, Days, ...)
- where the arguments are named as the same as the column name) with the default lambda function (~
- where the arguments are .x
, .y
if only two elements or if more than two - ..1
, ..2
, ..3
etc). In the OP's code
library(dplyr)
library(purrr)
df %>%
mutate(pmap_dfr(., ~ c(..., setNames(rep(Weight, Days), 1:Days))))
The 'Weight', 'Days' returns the full column values from original dataset and not from rows. If we want to still make use of the above command, we need to convert the data captured in each row to a tibble
and use with
df %>%
pmap_dfr(., ~ with(as_tibble(list(...)),
setNames(rep(Weight, Days), seq_len(Days))))
# A tibble: 3 x 7
# `1` `2` `3` `4` `5` `6` `7`
# <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 0.03 0.03 0.03 0.03 0.03 NA NA
#2 0.02 0.02 0.02 0.02 0.02 0.02 0.02
#3 0.04 0.04 0.04 NA NA NA NA
If we want the other columns,
df %>%
pmap_dfr(., ~ c(list(...)[-(3:4)], with(as_tibble(list(...)),
setNames(rep(Weight, Days), seq_len(Days)))))
# A tibble: 3 x 9
# Name School `1` `2` `3` `4` `5` `6` `7`
# <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 Antoine Bach 0.03 0.03 0.03 0.03 0.03 NA NA
#2 Antoine Ken 0.02 0.02 0.02 0.02 0.02 0.02 0.02
#3 Barbara Franklin 0.04 0.04 0.04 NA NA NA NA
Or use rowwise
library(tidyr)
df %>%
rowwise %>%
mutate(out = list(setNames(rep(Weight, Days), seq_len(Days)))) %>%
ungroup %>%
unnest_wider(c(out)) %>%
select(-Weight, -Days)
# A tibble: 3 x 9
# Name School `1` `2` `3` `4` `5` `6` `7`
# <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 Antoine Bach 0.03 0.03 0.03 0.03 0.03 NA NA
#2 Antoine Ken 0.02 0.02 0.02 0.02 0.02 0.02 0.02
#3 Barbara Franklin 0.04 0.04 0.04 NA NA NA NA
Upvotes: 3