JD Long

Reputation: 60756

Passing a sort direction to `arrange` in a `dplyr` data pipeline function

I have a function that does a bunch of things with my data. I want to add a sort-order parameter so that the function returns the data sorted in the opposite direction when that parameter is passed. The function also needs to be compatible with SQL backends through dbplyr.

My current solution seems really inelegant: I have two complete data pipelines, one with desc() and one without. This feels clunky, but since I have to wrap desc() around my field in dplyr, I can't think how else to do it. One idea is to create a sort parameter that is either 1 or -1 and multiply my field by it before sorting. Is there an easier or simpler way to do this?

Here's a simple toy example showing how I'm creating two pipelines:

library(dplyr)

df <- data.frame(x = rnorm(10))

stupid_func <- function(df, sort_order = 'asc'){
  ## does many things in reality, this is a toy example

  if (sort_order == 'asc') {
    df %>% arrange(x) %>% return
  } else if (sort_order == 'desc') {
    df %>% arrange(desc(x)) %>% return
  }

}

stupid_func(df, 'desc')
#>             x
#> 1   1.6680607
#> 2   1.4853252
#> 3   1.1468913
#> 4   1.0447893
#> 5   0.5243115
#> 6   0.3784285
#> 7  -0.5693750
#> 8  -0.8744429
#> 9  -1.0346144
#> 10 -2.6256735

stupid_func(df)
#>             x
#> 1  -2.6256735
#> 2  -1.0346144
#> 3  -0.8744429
#> 4  -0.5693750
#> 5   0.3784285
#> 6   0.5243115
#> 7   1.0447893
#> 8   1.1468913
#> 9   1.4853252
#> 10  1.6680607

And here's a version where the sort parameter is mapped to a variable, fac, that is either 1 or -1:

stupid_func2 <- function(df, sort_order = 'asc'){
  ## does many things in reality

  if (sort_order == 'asc') {
    fac <- 1
  } else {
    fac <- -1
  }

  df %>% arrange(fac * x) %>% return

}

Upvotes: 2

Views: 1056

Answers (2)

alistaire

Reputation: 43364

To avoid control flow entirely, you can pass either desc or identity as a function instead of a string and call it:

library(dplyr)
set.seed(47)

df <- data.frame(x = rnorm(2))

f <- function(data, sort_fun = identity){
    arrange(data, sort_fun(x))
}

f(df)
#>           x
#> 1 0.7111425
#> 2 1.9946963

f(df, desc)
#>           x
#> 1 1.9946963
#> 2 0.7111425

If you really want to input strings, you could use them to look up the appropriate function, which could be called the same way:

f2 <- function(data, sort_order = c('asc', 'desc')){
    sort_order <- match.arg(sort_order)
    sort_fun <- list(asc = identity, desc = desc)[[sort_order]]
    arrange(data, sort_fun(x))
}

f2(df)
#>           x
#> 1 0.7111425
#> 2 1.9946963

f2(df, 'desc')
#>           x
#> 1 1.9946963
#> 2 0.7111425
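
A side effect of match.arg worth spelling out: it validates the string as well as supplying the default. The sketch below is only an illustration (the name `sort_by_x` and the `toy` data are made up for this example, not part of the original code). It mirrors f2 and checks that an unsupported value raises an error, whereas the question's if/else version falls through both branches for an unrecognised string.

```r
library(dplyr)

# Hypothetical helper mirroring f2; the name `sort_by_x` is illustrative only.
sort_by_x <- function(data, sort_order = c("asc", "desc")) {
  # match.arg() picks the first choice ("asc") when nothing is supplied
  # and raises an error for values outside the allowed set
  sort_order <- match.arg(sort_order)
  sort_fun <- list(asc = identity, desc = desc)[[sort_order]]
  arrange(data, sort_fun(x))
}

toy <- data.frame(x = c(2, 3, 1))

asc_x  <- sort_by_x(toy)$x          # default direction
desc_x <- sort_by_x(toy, "desc")$x  # explicit descending

# Capture the error from an unsupported value instead of letting it stop the script
bad <- tryCatch(sort_by_x(toy, "sideways"), error = function(e) "error")
```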

You can similarly look up expressions, which lets you avoid identity altogether:

f3 <- function(data, sort_order = c('asc', 'desc')){
    sort_order <- match.arg(sort_order)
    sort_expr <- list(asc = expr(x), desc = expr(desc(x)))[[sort_order]]
    arrange(data, !!sort_expr)  # use the `data` argument, not the global `df`
}

f3(df)
#>           x
#> 1 0.7111425
#> 2 1.9946963

f3(df, 'desc')
#>           x
#> 1 1.9946963
#> 2 0.7111425
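
To make the expression lookup a little more concrete, here is a small sketch using a made-up `toy` data frame (the variable names are illustrative, not from the original). `expr()` captures code without evaluating it, and `!!` splices the captured expression into the arrange() call, so arrange() sees it as if it had been typed there directly.

```r
library(dplyr)

toy <- data.frame(x = c(2, 3, 1))

# expr() captures the code itself rather than its value
asc_expr  <- expr(x)        # a bare symbol
desc_expr <- expr(desc(x))  # an unevaluated call to desc()

# !! splices the captured expression into the arrange() call
asc_x  <- arrange(toy, !!asc_expr)$x
desc_x <- arrange(toy, !!desc_expr)$x
```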

Upvotes: 7

akuiper

Reputation: 215137

How about moving the if / else statement into the arrange call:

stupid_func <- function(df, ascending=TRUE){
    ## does many things in reality, this is a toy example

    df %>% arrange(if(ascending) x else desc(x))
}

stupid_func(df)
#               x
#1  -1.4162465950
#2  -1.0428581093
#3  -0.3558181508
#4  -0.2366332875
#5   0.0003166344
#6   0.5146631983
#7   0.6390745275
#8   0.7459405376
#9   1.6161165230
#10  1.9243922633

stupid_func(df, ascending = FALSE)
#               x
#1   1.9243922633
#2   1.6161165230
#3   0.7459405376
#4   0.6390745275
#5   0.5146631983
#6   0.0003166344
#7  -0.2366332875
#8  -0.3558181508
#9  -1.0428581093
#10 -1.4162465950
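
As a rough sketch of how this keeps a longer function down to a single pipeline, here is the same inline if / else placed after an earlier step. The function name `sort_pipeline`, the filter step, and the `toy` data are assumptions made up for illustration. This has only been reasoned through for a local data frame; how it translates on a dbplyr backend is not checked here.

```r
library(dplyr)

# Hypothetical function: one pipeline, with the direction chosen inside arrange()
sort_pipeline <- function(data, ascending = TRUE) {
  data %>%
    filter(!is.na(x)) %>%                    # stand-in for the "many things" done in reality
    arrange(if (ascending) x else desc(x))   # `ascending` is found in the function's environment
}

toy <- data.frame(x = c(2, NA, 3, 1))

up   <- sort_pipeline(toy)$x
down <- sort_pipeline(toy, ascending = FALSE)$x
```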

Upvotes: 5
