Carlos Luis Rivera

Reputation: 3693

Arrange multiple columns with characters by frequency using fct_infreq()

I would like to sort a data frame by two character columns using forcats::fct_infreq() inside dplyr::arrange(). Specifically, I want to sort df by the frequency of z and then, within each z group, by the frequency of x. Sorting by z works, but x does not come out in the order I want.

MWE

#### load tibble, dplyr, and forcats ####
library(tidyverse)

#### data creation ####
df <- tibble(
  x = c(
    "are", "kore", "sore", "are", "kore", "are", "kore",
    "sore", "sore", "kore", "kore", "kore", "kore", "kore"
    ), 
  y = c(
    2, 1, 2, 3, 2, 3, 1, 
    2, 2, 1, 1, 1, 1, 1
    ), 
  z = c(
    "foo", "foo", "foo", "bar", "bar", "bar", "baz",
    "baz", "baz", "qux", "qux", "qux", "qux", "qux"
    )
)

df %>%
  arrange(
    fct_infreq(z), # z is sorted by frequency as expected
    fct_infreq(x)  # x is not sorted the way I want
  )

Output of the code above

# A tibble: 14 x 3
   x         y z    
   <chr> <dbl> <chr>
 1 kore      1 qux  
 2 kore      1 qux  
 3 kore      1 qux  
 4 kore      1 qux  
 5 kore      1 qux  
 6 kore      2 bar  #  <- `are` is more frequent in the group of `bar` than `kore`
 7 are       3 bar  #
 8 are       3 bar  #
 9 kore      1 baz  ## <- `sore` is more frequent in the group of `baz` than `kore`
10 sore      2 baz  ##
11 sore      2 baz  ##
12 kore      1 foo  
13 are       2 foo  
14 sore      2 foo  

Desired output

# A tibble: 14 x 3
   x         y z    
   <chr> <dbl> <chr>
 1 kore      1 qux  
 2 kore      1 qux  
 3 kore      1 qux  
 4 kore      1 qux  
 5 kore      1 qux  
 6 are       3 bar  #  <- `are` is more frequent in the group of `bar` than `kore`
 7 are       3 bar  #
 8 kore      2 bar  #
 9 sore      2 baz  ## <- `sore` is more frequent in the group of `baz` than `kore`
10 sore      2 baz  ##
11 kore      1 baz  ##
12 kore      1 foo  
13 are       2 foo  
14 sore      2 foo  

How can I achieve this using forcats::fct_infreq() and dplyr::arrange()?

Upvotes: 0

Views: 132

Answers (1)

Ronak Shah

Reputation: 389325

I have not used fct_infreq(); here is a different approach that counts each (z, x) pair and joins the counts back to df:

library(dplyr)

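df %>%
  count(z, x, sort = TRUE) %>%         # count each (z, x) pair, largest first
  arrange(match(z, unique(z))) %>%     # keep each z together, in order of first appearance
  left_join(df, by = c('z', 'x')) %>%  # expand back to one row per original observation
  select(-n)                           # drop the helper count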

#    z     x        y
#   <chr> <chr> <dbl>
# 1 qux   kore      1
# 2 qux   kore      1
# 3 qux   kore      1
# 4 qux   kore      1
# 5 qux   kore      1
# 6 bar   are       3
# 7 bar   are       3
# 8 bar   kore      2
# 9 baz   sore      2
#10 baz   sore      2
#11 baz   kore      1
#12 foo   are       2
#13 foo   kore      1
#14 foo   sore      2
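
For reference, here is a sketch that stays closer to what the question asked for. The likely reason the original attempt fails is that fct_infreq(x) inside arrange() counts x over the whole data frame, where `kore` is the most frequent value, so `kore` sorts first in every z group. Computing the frequency rank of x inside each z group avoids that. The helper column name `x_rank` is my own and purely illustrative. Note also that the pipeline above orders the z groups by their largest (z, x) count, which happens to coincide with the frequency of z for this data; the sketch below orders by the frequency of z itself.

```r
library(dplyr)
library(forcats)
library(tibble)

# same data as in the question
df <- tibble(
  x = c("are", "kore", "sore", "are", "kore", "are", "kore",
        "sore", "sore", "kore", "kore", "kore", "kore", "kore"),
  y = c(2, 1, 2, 3, 2, 3, 1, 2, 2, 1, 1, 1, 1, 1),
  z = c("foo", "foo", "foo", "bar", "bar", "bar", "baz",
        "baz", "baz", "qux", "qux", "qux", "qux", "qux")
)

result <- df %>%
  group_by(z) %>%
  # rank of each x value by its frequency *within* the current z group
  # (x_rank is a hypothetical helper column, dropped again below)
  mutate(x_rank = as.integer(fct_infreq(x))) %>%
  ungroup() %>%
  # overall frequency of z first, then the within-group rank of x
  arrange(fct_infreq(z), x_rank) %>%
  select(-x_rank)

result
```

This should give the same row order as the output above, with the columns left in their original x, y, z order. Ties (such as the three values in the `foo` group, each occurring once) fall back to alphabetical order, because the character columns are converted to factors with alphabetically sorted levels before counting.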

Upvotes: 1
