Patrick Balada

Reputation: 1450

How to apply a function to a data frame for multiple inputs and create columns with the outputs using dplyr?

Given the following data

data_in <- data.frame(X1 = c(1, 3, 5, 2, 6),
                      X2 = c(2, 4, 5, 1, 8),
                      X3 = c(3, 2, 4, 1, 4))

I wrote a function that takes the data frame, a value (here called distance) and a string (used as the new column's name), and counts, for each row, the number of values less than or equal to the input value.

library(dplyr)

custom_function <- function(some_data_frame, distance, name) {
  # Add a column called `name` holding the per-row count of values <= distance
  some_data_frame %>%
    mutate(!!name := rowSums(. <= distance, na.rm = TRUE))
}

I can apply the function to the data as follows:

data_in %>% 
 custom_function(., 5, "some_name")
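
For reference, here is a minimal self-contained sketch of that call and of what it returns. The name result is only illustrative, and the function body is the one given above without the trailing return().

```r
library(dplyr)

data_in <- data.frame(X1 = c(1, 3, 5, 2, 6),
                      X2 = c(2, 4, 5, 1, 8),
                      X3 = c(3, 2, 4, 1, 4))

custom_function <- function(some_data_frame, distance, name) {
  some_data_frame %>%
    mutate(!!name := rowSums(. <= distance, na.rm = TRUE))
}

# `result` is an illustrative name, not part of the original code
result <- data_in %>%
  custom_function(5, "some_name")

# The new column counts, per row, how many of X1, X2, X3 are <= 5.
# Only the last row (6, 8, 4) has values above 5, so its count is 1;
# every other row has a count of 3.
names(result)
```

The original three columns are kept and some_name is appended as a fourth.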

What I would like now is to pass a vector of distances and have my custom function create one column per distance. For example, for c(1, 3, 5) I would like to get three columns automatically, without hardcoding three separate calls to the function.

Upvotes: 0

Views: 57

Answers (2)

utubun

Reputation: 4520

There is an easy way to do that with mapply (using the same distances as in @Sotos's answer):

(dst <- c(5, 3, 1, 6, 7, 8))
# [1] 5 3 1 6 7 8

(cnm <- paste('some_name', dst, sep = '_'))
# [1] "some_name_5" "some_name_3" "some_name_1" "some_name_6" "some_name_7" "some_name_8"

data_in[, cnm] <- mapply(function(d) rowSums(data_in <= d, na.rm = TRUE), d = dst)

data_in
#   X1 X2 X3 some_name_5 some_name_3 some_name_1 some_name_6 some_name_7 some_name_8
# 1  1  2  3           3           3           1           3           3           3
# 2  3  4  2           3           2           0           3           3           3
# 3  5  5  4           3           0           0           3           3           3
# 4  2  1  1           3           3           2           3           3           3
# 5  6  8  4           1           0           0           2           2           3

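You can obtain the same result within the tidyverse using purrr::map2. Note that this starts from the original three-column data_in, before the mapply assignment above; otherwise the columns added there would be counted as well: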

cbind(
  data_in,
  purrr::map2(dst, cnm, ~custom_function(data_in, .x, .y))
)

#   X1 X2 X3 some_name_5 some_name_3 some_name_1 some_name_6 some_name_7 some_name_8
# 1  1  2  3           3           3           1           3           3           3
# 2  3  4  2           3           2           0           3           3           3
# 3  5  5  4           3           0           0           3           3           3
# 4  2  1  1           3           3           2           3           3           3
# 5  6  8  4           1           0           0           2           2           3

Here custom_function() is redefined with transmute() instead of mutate(), so that it returns only the new column:

custom_function <- function(some_data_frame, distance, name) {
  some_data_frame %>% 
    transmute(!!name := rowSums(. <= distance, na.rm = TRUE))
}
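
As a variation on the same idea, here is a minimal sketch that skips custom_function() and maps over a named vector, using the distances c(1, 3, 5) from the question. The some_name_ prefix and the names new_cols and result are illustrative assumptions, not part of the original code.

```r
library(purrr)

data_in <- data.frame(X1 = c(1, 3, 5, 2, 6),
                      X2 = c(2, 4, 5, 1, 8),
                      X3 = c(3, 2, 4, 1, 4))

dst <- c(1, 3, 5)

# Name the distances first, so that map() returns a named list
# whose names become the new column names
new_cols <- map(
  set_names(dst, paste0("some_name_", dst)),
  ~ rowSums(data_in <= .x, na.rm = TRUE)
)

# Bind the count columns to the untouched input
result <- cbind(data_in, as.data.frame(new_cols))
```

Because data_in itself is never modified, each count is computed from X1, X2 and X3 only.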

Upvotes: 1

Sotos

Reputation: 51582

You can use sapply to loop over your vector of distances and cbind the results at the end, i.e.

cbind.data.frame(data_in, 
                 do.call(cbind.data.frame, sapply(c(5, 3, 1, 6, 7, 8), function(i) 
                   custom_function(data_in, i, paste0('some_name_', i))[ncol(data_in) + 1])))

which gives,

  X1 X2 X3 some_name_5 some_name_3 some_name_1 some_name_6 some_name_7 some_name_8
1  1  2  3           3           3           1           3           3           3
2  3  4  2           3           2           0           3           3           3
3  5  5  4           3           0           0           3           3           3
4  2  1  1           3           3           2           3           3           3
5  6  8  4           1           0           0           2           2           3
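
If you need this in more than one place, the same loop can be wrapped in a small helper. The following is only a base R sketch: add_distance_counts and its prefix argument are hypothetical names, and it calls rowSums directly rather than going through custom_function.

```r
# Hypothetical helper: append one count column per distance
add_distance_counts <- function(df, distances, prefix = "some_name_") {
  # One vector of per-row counts for each distance
  counts <- lapply(distances, function(d) rowSums(df <= d, na.rm = TRUE))
  names(counts) <- paste0(prefix, distances)
  cbind(df, as.data.frame(counts))
}

data_in <- data.frame(X1 = c(1, 3, 5, 2, 6),
                      X2 = c(2, 4, 5, 1, 8),
                      X3 = c(3, 2, 4, 1, 4))

# Distances taken from the question
result <- add_distance_counts(data_in, c(1, 3, 5))
```

Using lapply and naming the list explicitly avoids relying on how sapply simplifies its result.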

Upvotes: 0
