hmhensen
hmhensen

Reputation: 3195

Sort/arrange character column with condition

I am trying to use dplyr's arrange to sort character strings based on a condition. I'd like to arrange on a column, but in ascending order if a second column is equal to one value and in descending order if the second column is equal to another value.

I found a couple similar questions (one of them one of my own previous questions), but they weren't directly applicable. Both of them sorted on numeric columns and they wouldn't work on character columns.

R - Conditionally sort multiple columns as ascending or descending by group

Sort/arrange within group for only chosen groups

Here's sample data:

df <- mtcars %>% as_tibble(rownames = "model") %>% select(model, mpg, cyl) %>% slice(1:8) %>% mutate(cond = c(rep("A", 4), rep("B", 4)))

If cond == "A", I want to sort "model" in ascending order. If cond == "B", I want to sort "model" in descending order. I'd like a solution that doesn't rely on bind_rows or a similar solution.

Possibly something in the form of:

library(dplyr)
df %>% arrange(ifelse(cond == "A", model, desc(model)))

The desired solution looks like:

## A tibble: 8 x 8
#  model              mpg   cyl cond 
#  <chr>              <dbl> <dbl> <chr>
#1 Datsun 710         22.8     4 A    
#2 Hornet 4 Drive     21.4     6 A    
#3 Mazda RX4          21       6 A    
#4 Mazda RX4 Wag      21       6 A    
#5 Valiant            18.1     6 B    
#6 Merc 240D          24.4     4 B    
#7 Hornet Sportabout  18.7     8 B    
#8 Duster 360         14.3     8 B    

Upvotes: 3

Views: 1349

Answers (4)

TimTeaFan
TimTeaFan

Reputation: 18541

This can be done with dplyr::arrange and ifelse. The trick is that {dplyr}'s desc function has a different output which is not the same type as its input:

ifelse(df$cond == "B", dplyr::desc(df$model), df$model)
#> [1] "Mazda RX4"      "Mazda RX4 Wag"  "Datsun 710"     "Hornet 4 Drive"
#> [5] "-4"             "-8"             "-2"             "-7"

To make desc() inside an ifelse work the easist way is to define a function asc() which does just the opposite of desc():

asc <- function(x) {
  xtfrm(x)
}

Now we can use both asc() and desc() in the ifelse statement to return the desired ordering.

library(dplyr)

df %>% 
  arrange(cond, ifelse(cond == "B", desc(model), asc(model)))

#> # A tibble: 8 x 4
#>   model               mpg   cyl cond 
#>   <chr>             <dbl> <dbl> <chr>
#> 1 Datsun 710         22.8     4 A    
#> 2 Hornet 4 Drive     21.4     6 A    
#> 3 Mazda RX4          21       6 A    
#> 4 Mazda RX4 Wag      21       6 A    
#> 5 Valiant            18.1     6 B    
#> 6 Merc 240D          24.4     4 B    
#> 7 Hornet Sportabout  18.7     8 B    
#> 8 Duster 360         14.3     8 B

Created on 2022-08-21 by the reprex package (v2.0.1)

Upvotes: 1

akrun
akrun

Reputation: 886938

We may use group_modify and group_by

library(dplyr)
df %>% 
    group_by(cond) %>%
    group_modify(~.x %>% 
              arrange(if(.y == 'A') model else desc(model)) ) %>%
    ungroup %>%
    select(names(df))

-output

# A tibble: 8 x 4
  model               mpg   cyl cond 
  <chr>             <dbl> <dbl> <chr>
1 Datsun 710         22.8     4 A    
2 Hornet 4 Drive     21.4     6 A    
3 Mazda RX4          21       6 A    
4 Mazda RX4 Wag      21       6 A    
5 Valiant            18.1     6 B    
6 Merc 240D          24.4     4 B    
7 Hornet Sportabout  18.7     8 B    
8 Duster 360         14.3     8 B   

It is also possible to subset the values of 'model' sort it separately and specify those as levels in factor to be get arranged accordingly

df %>%  
   arrange(factor(model, levels = c(sort(model[cond == 'A']), 
           sort(model[cond != "A"], decreasing = TRUE))))

-output

# A tibble: 8 x 4
  model               mpg   cyl cond 
  <chr>             <dbl> <dbl> <chr>
1 Datsun 710         22.8     4 A    
2 Hornet 4 Drive     21.4     6 A    
3 Mazda RX4          21       6 A    
4 Mazda RX4 Wag      21       6 A    
5 Valiant            18.1     6 B    
6 Merc 240D          24.4     4 B    
7 Hornet Sportabout  18.7     8 B    
8 Duster 360         14.3     8 B    

Upvotes: 3

Noah
Noah

Reputation: 4414

As I find is often the case, a base R solution is much simpler:

split(df, df$cond) <- lapply(split(df, df$cond), function(d) {
    d[order(d$model, decreasing = all(d$cond == "B")),]
})

This splits the dataset by condition, then orders each split based on whether the condition is "A" or "B", and then re-inserts the splits into their respective places.

Upvotes: 1

Kra.P
Kra.P

Reputation: 15123

It's pretty messy, but it works.

mtcars %>% 
  as_tibble(rownames = "model") %>% 
  select(model, mpg, cyl) %>% 
  slice(1:8) %>% 
  mutate(cond = c(rep("A", 4), rep("B", 4))) %>%
  group_by(cond) %>%
  arrange(model, .by_group = TRUE) %>%
  add_rownames(., var = "rowna") %>%
  mutate(rowna = as.numeric(rowna)) %>%
  rowwise() %>%
  mutate(idx = ifelse(cond == "A", rowna, 8 - rowna)) %>%
  group_by(cond) %>%
  arrange(idx, .by_group = TRUE) %>%
  select(-idx, -rowna)

  model               mpg   cyl cond 
  <chr>             <dbl> <dbl> <chr>
1 Datsun 710         22.8     4 A    
2 Hornet 4 Drive     21.4     6 A    
3 Mazda RX4          21       6 A    
4 Mazda RX4 Wag      21       6 A    
5 Valiant            18.1     6 B    
6 Merc 240D          24.4     4 B    
7 Hornet Sportabout  18.7     8 B    
8 Duster 360         14.3     8 B 

Upvotes: 2

Related Questions