hmhensen

Reputation: 3195

How to call columns implicitly in certain R functions

Some R functions, such as pmin, require each column to be passed explicitly by name. My question is how to get around this, preferably using the tidyverse.

Here's some sample data.

library(tidyverse)
df <- tibble(a = c(1:5), 
             b = c(6:10), 
             d = c(11:15), 
             e = c(16:20))

# A tibble: 5 x 4
      a     b     d     e
  <int> <int> <int> <int>
1     1     6    11    16
2     2     7    12    17
3     3     8    13    18
4     4     9    14    19
5     5    10    15    20

Now I'd like to find the minimum of all the columns except for "e". I can do this:

df %>% 
  mutate(min = pmin(a, b, d))

# A tibble: 5 x 5
      a     b     d     e   min
  <int> <int> <int> <int> <int>
1     1     6    11    16     1
2     2     7    12    17     2
3     3     8    13    18     3
4     4     9    14    19     4
5     5    10    15    20     5

But what if I have many columns and would like to pass every column except "e" without typing out each column's name? I've made several attempts, but none were successful. I used the column index in my examples, but I'd prefer to exclude "e" by name. See below.

df %>% 
  mutate(min = pmin(-e))

df %>% 
  mutate(min = pmin(names(. %>% select(.))[-4]))

df %>% 
  mutate(min = pmin(names(.)[-4]))

df %>% 
  mutate(min = pmin(noquote(paste0(names(.)[-4], collapse = ","))))

df %>% 
  mutate(min = pmin(!!ensyms(names(.)[-4])))

None of these worked and I'm a bit at a loss.

Upvotes: 3

Views: 128

Answers (3)

akrun

Reputation: 886928

We can also use reduce with pmin:

library(dplyr)
library(purrr)
df %>% 
   mutate(min = select(., -e) %>% 
                   reduce(pmin))
# A tibble: 5 x 5
#      a     b     d     e   min
#  <int> <int> <int> <int> <int>
#1     1     6    11    16     1
#2     2     7    12    17     2
#3     3     8    13    18     3
#4     4     9    14    19     4
#5     5    10    15    20     5
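
For intuition, reduce folds pmin over the selected columns two at a time, so with three columns it amounts to pmin(pmin(a, b), d). A minimal sketch of the same idea in base R, assuming the sample data from the question (Reduce is the base counterpart of purrr's reduce; the names cols and row_min are only illustrative):

```r
# Sample data from the question, built as a plain data frame
df <- data.frame(a = 1:5, b = 6:10, d = 11:15, e = 16:20)

# Keep every column except "e", excluding it by name
cols <- df[setdiff(names(df), "e")]

# Fold pmin over the remaining columns: pmin(pmin(a, b), d)
row_min <- Reduce(pmin, cols)

# Compare against naming the columns explicitly
all(row_min == pmin(df$a, df$b, df$d))
```

Since the fold works column by column, it should not matter how many columns are left after dropping "e".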

Or with syms and !!!. Note that the en- prefixed variants (such as ensyms) are meant for capturing arguments from inside a function, which is why ensyms did not work here:

df %>% 
    mutate(min = pmin(!!! syms(names(.)[-4])))
# A tibble: 5 x 5
#      a     b     d     e   min
#  <int> <int> <int> <int> <int>
#1     1     6    11    16     1
#2     2     7    12    17     2
#3     3     8    13    18     3
#4     4     9    14    19     4
#5     5    10    15    20     5
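
Since the question asks to exclude "e" by name rather than by position, the same idea can probably be written with setdiff instead of [-4]. This is a sketch under that assumption; keep and res are illustrative names, and syms is called through rlang explicitly to make its origin clear:

```r
library(dplyr)

df <- tibble(a = 1:5, b = 6:10, d = 11:15, e = 16:20)

# Column names to keep, dropping "e" by name instead of by index
keep <- setdiff(names(df), "e")

# Convert the names to symbols and splice them into pmin() as separate arguments
res <- df %>%
  mutate(min = pmin(!!!rlang::syms(keep)))

res
```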

Upvotes: 3

tmfmnk

Reputation: 39858

One option using dplyr and purrr could be:

df %>%
 mutate(min = exec(pmin, !!!select(., -e)))

      a     b     d     e   min
  <int> <int> <int> <int> <int>
1     1     6    11    16     1
2     2     7    12    17     2
3     3     8    13    18     3
4     4     9    14    19     4
5     5    10    15    20     5

For those not reading the comments, a nice option proposed by @IceCreamToucan, needing only dplyr on top of base R, could be:

df %>%
 mutate(min = do.call(pmin, select(., -e)))
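
Because do.call takes a list of arguments, extra arguments to pmin such as na.rm can be appended to that list. A small sketch, assuming a hypothetical variant of the data (df_na) that contains a missing value:

```r
library(dplyr)

# Hypothetical data with one NA, not part of the original question
df_na <- tibble(a = c(1L, NA, 3L), b = 4:6, e = 7:9)

# Append na.rm = TRUE to the list of columns passed to pmin()
res <- df_na %>%
  mutate(min = do.call(pmin, c(as.list(select(., -e)), list(na.rm = TRUE))))

res
```

Without na.rm = TRUE, the second row's minimum would be NA, since pmin propagates missing values by default.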

Upvotes: 3

Giovanni Colitti

Reputation: 2334

I would do this by reshaping long, calculating the min by group, and then reshaping back to wide:

df %>% 
  rowid_to_column() %>% 
  pivot_longer(cols = -c(e, rowid)) %>% 
  group_by(rowid) %>% 
  mutate(min = min(value)) %>% 
  ungroup() %>% 
  pivot_wider() %>% 
  select(-rowid, -min, min)

Upvotes: 0
