pdubois

Reputation: 7800

Sort dataframe by column group length in R

I have the following dataframe:

v2 <- c(4.5, 2.5, 3.5, 5.5, 7.5, 6.5, 2.5, 1.5, 3.5)
v1 <- c(2.2, 3.2, 1.2, 4.2, 2.2, 3.2, 2.2, 1.2, 5.2)
lvl <- c("a","a","b","b","b","b","c","c","c")
d <- data.frame(v1,v2,lvl)
d

   v1  v2 lvl
1 2.2 4.5   a
2 3.2 2.5   a
3 1.2 3.5   b
4 4.2 5.5   b
5 2.2 7.5   b
6 3.2 6.5   b
7 2.2 2.5   c
8 1.2 1.5   c
9 5.2 3.5   c

I want to sort the dataframe by the size of each lvl group, smallest group first, yielding the following result:

 v1  v2 lvl
2.2 4.5   a
3.2 2.5   a
2.2 2.5   c
1.2 1.5   c
5.2 3.5   c
1.2 3.5   b
4.2 5.5   b
2.2 7.5   b
3.2 6.5   b

This is because a has length 2, c has length 3, and b has length 4.

How can that be achieved?

Upvotes: 3

Views: 110

Answers (2)

thelatemail

Reputation: 93938

Using ave:

d[order(ave(seq(d$lvl),d$lvl,FUN=length)),]

#   v1  v2 lvl
#1 2.2 4.5   a
#2 3.2 2.5   a
#7 2.2 2.5   c
#8 1.2 1.5   c
#9 5.2 3.5   c
#3 1.2 3.5   b
#4 4.2 5.5   b
#5 2.2 7.5   b
#6 3.2 6.5   b

This works because ave returns, for every row, the size of the d$lvl group that row belongs to. The seq call only supplies a placeholder numeric vector of the same length as d$lvl for ave to work on, so the counts come back as numbers whatever the type of d$lvl is.

ave(seq(d$lvl),d$lvl,FUN=length)
#[1] 2 2 4 4 4 4 3 3 3
order(ave(seq(d$lvl),d$lvl,FUN=length))
#[1] 1 2 7 8 9 3 4 5 6
#    a a c c c b b b b
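For comparison, the same per-row group size can also be looked up from table. This is only a sketch rebuilt from the question's data, and the names counts, group_size and sorted are made up for illustration. The second sort key is an addition of mine: when two groups have the same size, sorting on the size alone could interleave their rows if they are interleaved in the input, so lvl is used as a tie-breaker here.

```r
# Rebuild the question's data frame so this block runs on its own.
v2 <- c(4.5, 2.5, 3.5, 5.5, 7.5, 6.5, 2.5, 1.5, 3.5)
v1 <- c(2.2, 3.2, 1.2, 4.2, 2.2, 3.2, 2.2, 1.2, 5.2)
lvl <- c("a","a","b","b","b","b","c","c","c")
d <- data.frame(v1, v2, lvl)

# Count the rows per group, then look up each row's group size by name.
# as.character() makes the lookup work whether lvl is character or factor.
counts <- table(d$lvl)
group_size <- as.integer(counts[as.character(d$lvl)])

# Sort by group size; lvl is a secondary key (my assumption, not part of
# the original answer) so that equally sized groups are not interleaved.
sorted <- d[order(group_size, as.character(d$lvl)), ]
sorted
```

With the data from the question there are no ties, so this should give the same row order as the ave version.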

Upvotes: 3

DatamineR

Reputation: 9628

d$lvl <- factor(d$lvl, levels = names(sort(table(d$lvl))))
d[order(d$lvl),]
   v1  v2 lvl
1 2.2 4.5   a
2 3.2 2.5   a
7 2.2 2.5   c
8 1.2 1.5   c
9 5.2 3.5   c
3 1.2 3.5   b
4 4.2 5.5   b
5 2.2 7.5   b
6 3.2 6.5   b
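Note that the code above overwrites d$lvl with a factor whose levels are reordered by group size. If you would rather leave the column untouched, a possible variation (a sketch only, with lvl_order and sorted as illustrative names) is to keep the size-sorted level names in a separate vector and order by each row's position in it using match:

```r
# Rebuild the question's data frame so this block runs on its own.
v2 <- c(4.5, 2.5, 3.5, 5.5, 7.5, 6.5, 2.5, 1.5, 3.5)
v1 <- c(2.2, 3.2, 1.2, 4.2, 2.2, 3.2, 2.2, 1.2, 5.2)
lvl <- c("a","a","b","b","b","b","c","c","c")
d <- data.frame(v1, v2, lvl)

# Level names sorted from the smallest group to the largest.
lvl_order <- names(sort(table(d$lvl)))

# match() gives each row the rank of its group in lvl_order;
# ordering by that rank sorts the rows without modifying d$lvl.
sorted <- d[order(match(d$lvl, lvl_order)), ]
sorted
```

Because every row of a group gets the same rank, groups of equal size still stay together here.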

Upvotes: 4
