Filtering a data frame according to values rank

Question

I have a data frame (df) such as:

group  value
a      4.2
a      4.5
a      6.2
b      5.1
b      3.5
a      4.2
a      5.1
b      6.4
b      3.3
b      4.1
a      5.0

The desired output is

group  value
a      4.5
a      6.2
a      5.1
a      5.0
b      5.1
b      6.4
b      4.1

That is, the desired output drops the two smallest "value"s of each "group". For example,

4.2 and 4.2 are the smallest two values of group a, and
3.5 and 3.3 are the two smallest values of group b.

The desired output includes all rows of df except the rows holding these values. How can I do that in R? I would be very glad for any help. Thanks a lot.

Roland · Accepted Answer

Here is a solution using package data.table and partial sorting:

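library(data.table)
# DF is the data frame from the question; setDT() converts it to a data.table by reference
setDT(DF)
# the two smallest values of each group
DF[, sort(value, partial = 2)[1:2], by = group]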
#   group  V1
#1:     a 4.2
#2:     a 4.2
#3:     b 3.3
#4:     b 3.5

# everything except the two smallest values of each group
DF[, sort(value, partial = 2)[-(1:2)], by = group]
#   group  V1
#1:     a 6.2
#2:     a 4.5
#3:     a 5.1
#4:     a 5.0
#5:     b 6.4
#6:     b 5.1
#7:     b 4.1
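
Note that a partial sort only guarantees that the element at the requested position is the one a full sort would put there, with nothing larger before it and nothing smaller after it. The order of the remaining values is not specified, which is why the rows above do not come back in their original order. A minimal base R sketch of that guarantee, using the group a values from the question (a_values, p, smallest_two and rest are illustrative names, not part of the code above):

```r
# Values of group a, in the order they appear in the question.
a_values <- c(4.2, 4.5, 6.2, 4.2, 5.1, 5.0)

# Partial sort: only position 2 is guaranteed to hold its fully sorted value.
p <- sort(a_values, partial = 2)

smallest_two <- p[1:2]   # the two smallest values
rest <- p[-(1:2)]        # everything else, in no guaranteed order

# Every remaining value is at least as large as the second smallest.
all(rest >= p[2])
```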

Of course, one of the many, many alternatives for split-apply-combine type operations could be used instead.
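
As one such alternative, here is a minimal base R sketch that ranks the values within each group and drops the rows ranked 1 or 2. It assumes the question's data is rebuilt as df below; rank_in_group and kept are illustrative names. Unlike the partial sort, this keeps the remaining rows in their original order within each group.

```r
# Rebuild the example data from the question.
df <- data.frame(
  group = c("a", "a", "a", "b", "b", "a", "a", "b", "b", "b", "a"),
  value = c(4.2, 4.5, 6.2, 5.1, 3.5, 4.2, 5.1, 6.4, 3.3, 4.1, 5.0)
)

# Rank the values within each group. ties.method = "first" gives tied values
# distinct consecutive ranks, so exactly two rows per group get rank 1 or 2.
rank_in_group <- ave(df$value, df$group,
                     FUN = function(v) rank(v, ties.method = "first"))

# Keep the rows ranked 3 or higher, then sort by group. order() leaves ties
# in their original order, so the row order within each group is preserved.
kept <- df[rank_in_group > 2, ]
kept <- kept[order(kept$group), ]
rownames(kept) <- NULL
kept
```

With ties.method = "first", the two 4.2 rows in group a are both dropped. If a group had more than two rows tied for the smallest value, only two of them would be removed.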
