tyler.reed

Reputation: 31

mutate across with vectorized function parameters

I know the "across" paradigm is "many columns, one function" so this might not be possible. The idea is that I want to apply the same function to several columns with a parameter varying based on the column.

I got this to work using cur_column(), but it amounts to looking up the parameters one by one rather than supplying a vector of parameters with one entry per column.

This first block produces what I want, but I'm wondering if there's a cleaner way.

library(dplyr)

df = data.frame(column1 = 1:100, column2 = 1:100)

parameters = data.frame(
    column_names = c('column1','column2'),
    parameters = c(10,100))
    
custom_function = function(x,addend){
    x + addend
}

df2 = df %>% mutate(
    across(all_of(parameters$column_names),
           ~custom_function(., addend = parameters %>%
                                        filter(column_names == cur_column()) %>% 
                                        pull(parameters))))

What I would like to write instead of that last call is something like

df2 = df %>% mutate(
    across(all_of(parameters$column_names), ~custom_function(., addend = parameters$parameters)))

Upvotes: 3

Views: 243

Answers (4)

akrun

Reputation: 887971

We could use match() to look up the current column name (cur_column()) in the column_names column of 'parameters', and extract the corresponding 'parameters' value to pass to custom_function:

library(dplyr)
df %>%
   mutate(across(all_of(parameters$column_names), 
     ~ custom_function(.x, parameters$parameters[match(cur_column(), 
      parameters$column_names)])))

-output

    column1 column2
1        11     101
2        12     102
3        13     103
4        14     104
5        15     105
6        16     106
7        17     107
8        18     108
...

Or convert the two-column data.frame to a named vector (deframe) and extract the value directly by name:

library(tibble)
params <- deframe(parameters)
df %>%
  mutate(across(all_of(names(params)),
   ~ custom_function(.x, params[cur_column()])))

-output

  column1 column2
1      11     101
2      12     102
3      13     103
4      14     104
5      15     105
6      16     106
...
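
If pulling in tibble only for deframe() is undesirable, the same named vector can be built in base R with setNames(). This is a minimal sketch of that variant rather than part of the original answer; it repeats the question's df, parameters and custom_function so that it runs on its own, and uses [[ ]] so the addend is passed without its name.

```r
library(dplyr)

# Data and function repeated from the question
df <- data.frame(column1 = 1:100, column2 = 1:100)

parameters <- data.frame(
  column_names = c("column1", "column2"),
  parameters = c(10, 100)
)

custom_function <- function(x, addend) {
  x + addend
}

# Named vector: names are the column names, values are the addends
params <- setNames(parameters$parameters, parameters$column_names)

df2 <- df %>%
  mutate(across(
    all_of(names(params)),
    # cur_column() is the name of the column being processed
    ~ custom_function(.x, params[[cur_column()]])
  ))

head(df2)
```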

Upvotes: 1

bischrob

Reputation: 584

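Edit: this doesn't work on the second column, apparently because arguments passed through across()'s ... are evaluated only once, so cur_column() is not re-evaluated for each column. I don't think across is what you want here, and one of the other answers is better. You could also pass the column name to your function like this: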

library(dplyr)

df = data.frame(column1 = 1:100, column2 = 1:100)

parameters = data.frame(
  column_names = c('column1','column2'),
  parameters = c(10,100))

custom_function = function(x,column, parameters){
  addend = parameters %>%
    filter(column_names == column) %>% 
    pull(parameters)
  x + addend
}

df2 = df %>% mutate(
  across(parameters$column_names,custom_function,cur_column(),parameters))
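
One way the call above might be repaired is to wrap the function in a lambda, so that cur_column() is evaluated inside the function body for each column instead of being passed through across()'s extra arguments. The sketch below assumes that this is the cause of the second-column problem; add_for_column is a hypothetical helper name, and it uses match() rather than filter()/pull() for the lookup.

```r
library(dplyr)

df <- data.frame(column1 = 1:100, column2 = 1:100)

parameters <- data.frame(
  column_names = c("column1", "column2"),
  parameters = c(10, 100)
)

# Hypothetical helper: find the addend for `column` in the lookup table
add_for_column <- function(x, column, lookup) {
  x + lookup$parameters[match(column, lookup$column_names)]
}

df2 <- df %>%
  mutate(across(
    all_of(parameters$column_names),
    # The lambda runs once per column, so cur_column() is up to date
    ~ add_for_column(.x, cur_column(), parameters)
  ))

head(df2)
```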

Upvotes: 0

M--

Reputation: 29238

We can do this in base with mapply:

mapply(`+`, df[,parameters$column_names], parameters$parameters)

##>      column1 column2
##> [1,]      11     101
##> [2,]      12     102
##> [3,]      13     103
##> [4,]      14     104
##> [5,]      15     105
##> ...
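
Note that mapply() simplifies its result to a matrix here. If the goal is to keep a data frame, including any columns that are not transformed, one option is Map(), which returns a list that can be assigned back into the selected columns. This is a sketch under that assumption; the extra id column is added purely for illustration and is not in the original data.

```r
# `id` is an illustrative extra column that should stay untouched
df <- data.frame(id = 1:100, column1 = 1:100, column2 = 1:100)

parameters <- data.frame(
  column_names = c("column1", "column2"),
  parameters = c(10, 100)
)

custom_function <- function(x, addend) {
  x + addend
}

df2 <- df
# Map() pairs each selected column with its addend and returns a list,
# which replaces the selected columns in place
df2[parameters$column_names] <- Map(
  custom_function,
  df[parameters$column_names],
  parameters$parameters
)

head(df2)
```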

Upvotes: 3

Allan Cameron

Reputation: 174586

I think a mapping function operating on the parameters would be easier than across on the data:

library(tidyverse)

with(parameters, as_tibble(map2(df[column_names], parameters, custom_function)))
#> # A tibble: 100 x 2
#>    column1 column2
#>      <dbl>   <dbl>
#>  1      11     101
#>  2      12     102
#>  3      13     103
#>  4      14     104
#>  5      15     105
#>  6      16     106
#>  7      17     107
#>  8      18     108
#>  9      19     109
#> 10      20     110
#> # ... with 90 more rows

Created on 2022-12-15 with reprex v2.0.2

Upvotes: 1
