Jake Thompson

Reputation: 2843

Arrange data frame by all columns using dplyr

I am generating data frames of 1s and 0s as follows:

library(tidyverse)
library(glue)

num_var <- 3

rep(list(c(0L, 1L)), num_var) %>%
  set_names(glue("var_{seq_len(num_var)}")) %>%
  expand.grid() %>%
  mutate(total = rowSums(.)) %>%
  select(total, everything()) %>%
  arrange(total, desc(var_1), desc(var_2), desc(var_3))

#>   total var_1 var_2 var_3
#> 1     0     0     0     0
#> 2     1     1     0     0
#> 3     1     0     1     0
#> 4     1     0     0     1
#> 5     2     1     1     0
#> 6     2     1     0     1
#> 7     2     0     1     1
#> 8     3     1     1     1

Created on 2018-01-08 by the reprex package (v0.1.1.9000).

I need to arrange by the total in ascending order, and then by each variable in descending order. This is fairly straightforward using dplyr::arrange(). However, I would like a more robust method of arranging. For example, if num_var is changed to 4, the final line must also be changed so that var_4 is included in the descending sort. I have tried using the tidy selector everything() inside arrange(), as I do with select(), but this errors:

library(tidyverse)
library(glue)

num_var <- 3

rep(list(c(0L, 1L)), num_var) %>%
  set_names(glue("var_{seq_len(num_var)}")) %>%
  expand.grid() %>%
  mutate(total = rowSums(.)) %>%
  select(total, everything()) %>%
  arrange(total, desc(everything()))

#> Error in arrange_impl(.data, dots): Evaluation error: No tidyselect variables were registered.

Created on 2018-01-08 by the reprex package (v0.1.1.9000).

Is there a way to select variables for arranging without naming them all directly?

Upvotes: 5

Views: 4131

Answers (3)

Matifou

Reputation: 8880

In dplyr 1.0.0 and later, you can use across() inside arrange():

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

iris %>% 
  arrange(across(everything(), desc)) %>% 
  head()
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
#> 1          7.9         3.8          6.4         2.0 virginica
#> 2          7.7         3.8          6.7         2.2 virginica
#> 3          7.7         3.0          6.1         2.3 virginica
#> 4          7.7         2.8          6.7         2.0 virginica
#> 5          7.7         2.6          6.9         2.3 virginica
#> 6          7.6         3.0          6.6         2.1 virginica


all.equal(iris %>% 
            arrange(across(everything(), desc)) ,
          iris %>% 
            arrange(desc(Sepal.Length), desc(Sepal.Width), desc(Petal.Length), desc(Petal.Width), desc(Species)))
#> [1] TRUE

Created on 2022-02-07 by the reprex package (v2.0.1)
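
Applied to the data frame from the question, the same idea might look like the sketch below. It assumes dplyr 1.0.0 or later (for across()) and that every column to be sorted in descending order starts with var_, as in the question. The name sorted_grid is only illustrative, and num_var is set to 4 to show that the arrange() call needs no change when the number of variables changes.

```r
library(dplyr)

num_var <- 4  # can be changed freely; arrange() below stays the same

sorted_grid <- rep(list(c(0L, 1L)), num_var) %>%
  setNames(paste0("var_", seq_len(num_var))) %>%  # base R naming, no glue needed
  expand.grid() %>%
  mutate(total = rowSums(.)) %>%
  select(total, everything()) %>%
  # total ascending, then every var_* column descending, left to right
  arrange(total, across(starts_with("var_"), desc))

head(sorted_grid)
```

Selecting with starts_with("var_") rather than everything() keeps total out of the descending part, so it is only sorted once, in ascending order.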

Upvotes: 10

Voy

Reputation: 99

You could arrange by every column, going left to right, using this code:

library(magrittr) ; library(rlang) ; library(dplyr)
data %>% arrange(!!!syms(colnames(.)))

This works because arrange() doesn't accept tidyselect syntax, so it must be passed a symbol for each column name (plain strings won't do: arrange() treats a string as a constant, not as a column name).
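
The snippet above sorts every column in ascending order. For the ordering asked for in the question (total ascending, the rest descending), one possible extension is to wrap each symbol in a desc() call before splicing. This is only a sketch: df, var_cols, desc_calls and sorted are illustrative names, and the small data frame is rebuilt here just so the example runs on its own.

```r
library(dplyr)
library(rlang)

# Small stand-in for the data frame from the question
df <- expand.grid(var_1 = 0:1, var_2 = 0:1, var_3 = 0:1) %>%
  mutate(total = rowSums(.))

# Every column except total, in left-to-right order
var_cols <- setdiff(colnames(df), "total")

# Build one unevaluated desc(<column>) call per column
desc_calls <- lapply(syms(var_cols), function(col) expr(desc(!!col)))

# Splice the calls into arrange() after the ascending total
sorted <- df %>% arrange(total, !!!desc_calls)

sorted
```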

Upvotes: 1

akuiper

Reputation: 214957

arrange() doesn't seem to work with select helper functions directly. You can use arrange_at() as the last step of the pipeline instead, with total in ascending order and every other variable (selected with -one_of("total")) in descending order:

arrange_at(vars(total, desc(-one_of("total"))))

#  total var_1 var_2 var_3
#1     0     0     0     0
#2     1     1     0     0
#3     1     0     1     0
#4     1     0     0     1
#5     2     1     1     0
#6     2     1     0     1
#7     2     0     1     1
#8     3     1     1     1

Upvotes: 2
