Access to the new column name from the function inside across

Question

Is it possible to get the name of the new column from the across function?

For example

data.frame(a=1) %>% 
mutate(across(c(b=a, c=a), function(i) if("new column name"=="b") a+1 else a+0.5))

Expected result:

#>   a b   c
#> 1 1 2 1.5

Created on 2021-12-09 by the reprex package (v2.0.0)

I tried cur_column(), but in this case it returns "a" rather than the new column name.

I apologise that this example is simple enough that the desired result could be achieved in other ways, but my actual code works on a large nested data frame that is difficult to share.

caldwellst · Accepted Answer

Interesting question. It seems that because you are defining b and c within the across call, they aren't available through cur_column().

library(dplyr)

# Print the name that cur_column() reports for each column created in across()
data.frame(a=1) %>%
  mutate(across(c(b=a, c=a), function(i) print(cur_column())))
#> [1] "a"
#> [1] "a"

It works fine if they are already defined in the data.frame. Using tibble() here so I can refer to a in the constructor.

tibble(a=1, b=a, c=a) %>%
  mutate(across(-a, ~ if (cur_column() == "b") .x + 1 else .x + 0.5))
#> # A tibble: 1 × 3
#>       a     b     c
#>   <dbl> <dbl> <dbl>
#> 1     1     2   1.5

It also works within a single mutate() call, as long as b and c are defined before the across() that calls cur_column().

data.frame(a=1) %>%
  mutate(across(c(b=a, c=a), ~.x),
         across(-a, ~ if (cur_column() == "b") .x + 1 else .x + 0.5))
#>   a b   c
#> 1 1 2 1.5

I believe this happens because across() works across the right-hand side (a) and assigns the values to the left-hand side (b and c) only after the function has been applied. I don't actually know whether this is the intended behavior (although it does seem reasonable), so I will open a GitHub issue, because I think it's an interesting example!
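If the new name really has to be visible inside the function, one possible workaround is to iterate over the new names yourself instead of relying on across(). The following is only a minimal base R sketch under that assumption, not something across() provides: new_names, new_cols and result are hypothetical names introduced for illustration, and the if/else mirrors the rule from the question.

```r
df <- data.frame(a = 1)

# Hypothetical vector of the column names we want to create
new_names <- c("b", "c")

# lapply() over a named character vector keeps the names, so each call
# receives the new column name as `nm` and can branch on it
new_cols <- lapply(
  setNames(new_names, new_names),
  function(nm) if (nm == "b") df$a + 1 else df$a + 0.5
)

# Attach the new columns to the original data frame
result <- cbind(df, as.data.frame(new_cols))
print(result)
```

Because the function is applied to the new names rather than to the source column, the name is available before any value is computed, which is the opposite order from the across() behaviour described above.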
