Reputation: 546

Sort a data frame by group sum first, then by frequency

I have the following data in a data.frame:

reg        id    n
Echo       E1    7
Charlie    C1    3
Alpha      A1    5
Echo       E2    1
Alpha      A2    4
Charlie    C2    9

I am trying to sort it by reg sum first, then by n (both decreasing). The result would be:

reg        id    n
Charlie    C2    9
Charlie    C1    3
Alpha      A1    5
Alpha      A2    4
Echo       E1    7
Echo       E2    1

Because Charlie has 12, Alpha has 9 and Echo has 8.

So far, I've only been able to group and do the second part of the sorting (through poor use of dplyr). The reg column still gets sorted alphabetically.

Is there an elegant solution to this?

Upvotes: 3

Answers (4)

Sandipan Dey

Reputation: 23139

This will also work:

df[with(df, order(-ave(df$n, df$reg, FUN=sum), -n)),]
      reg id n
6 Charlie C2 9
2 Charlie C1 3
3   Alpha A1 5
5   Alpha A2 4
1    Echo E1 7
4    Echo E2 1

Upvotes: 0

akrun

Reputation: 887981

We can use data.table

library(data.table)
setDT(df1)[, n1:= sum(n) , by = reg][order(-n1, -n)][, n1:= NULL][]
#       reg id n
#1: Charlie C2 9
#2: Charlie C1 3
#3:   Alpha A1 5
#4:   Alpha A2 4
#5:    Echo E1 7
#6:    Echo E2 1

Upvotes: 1

Parfait

Reputation: 107767

Consider calculating the group sum with ave() then sort both columns, filtering out sum column:

df <- df[with(transform(df, sum=ave(df$n, df$reg, FUN=sum)), order(-sum, -n)), c(1:3)]
row.names(df) <- seq_len(nrow(df))

df

#       reg id n
# 1 Charlie C2 9
# 2 Charlie C1 3
# 3   Alpha A1 5
# 4   Alpha A2 4
# 5    Echo E1 7
# 6    Echo E2 1

Upvotes: 2

zx8754

Reputation: 56269

Using dplyr

# data
df1 <- read.table(text = "reg        id    n
Echo       E1    7
Charlie    C1    3
Alpha      A1    5
Echo       E2    1
Alpha      A2    4
Charlie    C2    9", header = TRUE)

library(dplyr)
df1 %>% 
  group_by(reg) %>% 
  mutate(nSum = sum(n)) %>% 
  arrange(-nSum, -n) %>% 
  select(-nSum)

# Source: local data frame [6 x 3]
# Groups: reg [3]
# 
#       reg     id     n
#    <fctr> <fctr> <int>
# 1 Charlie     C2     9
# 2 Charlie     C1     3
# 3   Alpha     A1     5
# 4   Alpha     A2     4
# 5    Echo     E1     7
# 6    Echo     E2     1

Upvotes: 2

Sort a data frame by group sum first, then by frequency

Answers (4)

Related Questions