bill999

Reputation: 2560

Create column of data frames based on function

I would like to use the map function with the tidyverse to create a column of data frames based on arguments from some, but not all, of the columns of the original data frame/tibble.

I would prefer to be able to use the map function so that I can replace this with future_map to utilize parallel computing.

The following solution does not use map, but it produces the correct end result (see also this question and answer: How to use rowwise to create a list column based on a function):

library(tidyverse)
library(purrr)

df <- data.frame(a= c(1,2,3), b=c(2,3,4), c=c(6,5,8))

fun <- function(q,y) {
    r <- data.frame(col1 = c(q+y, q, q, y), col2 = c(q,q,q,y))
    r
}

result1 <- df %>% rowwise(a) %>% mutate(list1 = list(fun(a, b)))

> result1
# A tibble: 3 × 4
# Rowwise:  a
      a     b     c list1       
  <dbl> <dbl> <dbl> <list>      
1     1     2     6 <df [4 × 2]>
2     2     3     5 <df [4 × 2]>
3     3     4     8 <df [4 × 2]>

How can I instead do this with map? Here are three incorrect attempts:

Incorrect attempt 1:

wrong1 <- df %>% mutate(list1 = map(list(a,b), fun))

Incorrect attempt 2:

wrong2 <- df %>% mutate(list1 = map(c(a,b), fun))

Incorrect attempt 3:

wrong3 <- df %>% mutate(list1 = list(map(list(a,b), fun)))

The error I get is: argument "y" is missing, with no default. I am not sure how to pass multiple arguments in a situation like this.

I would like a solution with multiple arguments, but if that is not possible, let's move to a function with one argument.

fun_one_arg <- function(q) {
    r <- data.frame(col1 = c(q, q, q, q+q), col2 = c(3*q,q,q,q/2))
    r
}

wrong4 <- df %>% mutate(list1 = map(a, fun_one_arg))
wrong5 <- df %>% mutate(list1 = list(map(a, fun_one_arg)))

These run, but the fourth column does not contain data frames, as I would have expected.

Upvotes: 0

Views: 58

Answers (1)

akrun

Reputation: 887961

We can use map2, since fun takes two arguments:

library(dplyr)
library(purrr)  # map2() and pmap() come from purrr
df %>%
    mutate(list1 = map2(a, b, fun)) %>%
    as_tibble
# A tibble: 3 x 4
      a     b     c list1       
  <dbl> <dbl> <dbl> <list>      
1     1     2     6 <df [4 × 2]>
2     2     3     5 <df [4 × 2]>
3     3     4     8 <df [4 × 2]>
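
To see why the map attempts fail while map2 works, here is a minimal sketch on plain vectors, reusing fun from the question. As far as I can tell, map walks over a single list and passes each element as the only argument, so y is never supplied; map2 walks over both vectors in parallel. The names a, b, map_attempt and res are just for illustration.

```r
library(purrr)

# Same function as in the question
fun <- function(q, y) {
  data.frame(col1 = c(q + y, q, q, y), col2 = c(q, q, q, y))
}

a <- c(1, 2, 3)
b <- c(2, 3, 4)

# map() calls fun(a) and then fun(b): only one argument each time,
# so evaluating y inside fun raises the "missing" error
map_attempt <- try(map(list(a, b), fun), silent = TRUE)
inherits(map_attempt, "try-error")

# map2() calls fun(a[1], b[1]), fun(a[2], b[2]), fun(a[3], b[3])
res <- map2(a, b, fun)
length(res)
res[[1]]
```

Each element of res is a 4 x 2 data frame, which is what mutate then stores as the list column.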

Another option is pmap, which can also take more than two columns. Here ..1 and ..2 refer to the selected columns, in the order they were selected:

df %>%
    mutate(list1 = pmap(across(c(a, b)), ~ fun(..1, ..2))) %>%
    as_tibble
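
A variation on the pmap call, as a sketch: if the list passed to pmap is named after the function's arguments (q and y here), pmap matches them by name and the ..1, ..2 lambda is not needed. The name result_pmap is only for illustration.

```r
library(dplyr)
library(purrr)

df <- data.frame(a = c(1, 2, 3), b = c(2, 3, 4), c = c(6, 5, 8))

# Same function as in the question
fun <- function(q, y) {
  data.frame(col1 = c(q + y, q, q, y), col2 = c(q, q, q, y))
}

# The list names q and y match fun's argument names,
# so each row is passed as fun(q = a[i], y = b[i])
result_pmap <- df %>%
  mutate(list1 = pmap(list(q = a, y = b), fun)) %>%
  as_tibble()

result_pmap$list1[[1]]
```

Since the goal is to move to parallel code later, note that the furrr package provides future_map2 and future_pmap as counterparts that take the same main arguments, so either call should be straightforward to swap once a future plan is set up.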

Upvotes: 1
