oercim

Reputation: 1848

Filtering a data frame according to the rank of its values

I have a data frame (df) such as:

group  value
a      4.2
a      4.5
a      6.2
b      5.1
b      3.5
a      4.2
a      5.1
b      6.4
b      3.3
b      4.1
a      5.0 

The desired output is

group  value
a      4.5
a      6.2  
a      5.1
a      5.0
b      5.1
b      6.4
b      4.1

Namely, the desired output drops the two smallest "value"s of each "group". For example, the two smallest values in group a are 4.2 and 4.2, and in group b they are 3.3 and 3.5.

The desired output includes all rows of df except the rows holding these values. How can I do that in R? I would be very glad for any help. Thanks a lot.

Upvotes: 2

Views: 1304

Answers (2)

Roland

Reputation: 132676

Here is a solution using package data.table and partial sorting:

library(data.table)
# DF is the data frame from the question (named df there)
setDT(DF)
# the two smallest values of each group
DF[, sort(value, partial = 2)[1:2], by = group]
#   group  V1
#1:     a 4.2
#2:     a 4.2
#3:     b 3.3
#4:     b 3.5

# everything except the two smallest values of each group
DF[, sort(value, partial = 2)[-(1:2)], by = group]
#   group  V1
#1:     a 6.2
#2:     a 4.5
#3:     a 5.1
#4:     a 5.0
#5:     b 6.4
#6:     b 5.1
#7:     b 4.1

Of course, one of the many, many alternatives for split-apply-combine type operations could be used instead.
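
As one illustration of such an alternative, here is a minimal base R sketch using ave() to rank values within each group. It is not part of the original answer; the data frame is rebuilt from the question, and the names rank_in_group and result are only illustrative. Unlike the data.table calls above, it keeps whole rows in their original order.

```r
# Example data copied from the question
df <- data.frame(
  group = c("a", "a", "a", "b", "b", "a", "a", "b", "b", "b", "a"),
  value = c(4.2, 4.5, 6.2, 5.1, 3.5, 4.2, 5.1, 6.4, 3.3, 4.1, 5.0)
)

# Rank values within each group; ties.method = "first" gives every row a
# distinct rank, so exactly two rows per group receive rank 1 or 2
rank_in_group <- ave(df$value, df$group,
                     FUN = function(v) rank(v, ties.method = "first"))

# Keep every row except the two smallest values of each group,
# preserving the original row order
result <- df[rank_in_group > 2, ]
print(result)
```

The rows come back interleaved by group rather than grouped as in the desired output; result[order(result$group), ] would group them if that matters.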

Upvotes: 4

akrun

Reputation: 886998

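An option using dplyr:

library(dplyr)
df %>%
  group_by(group) %>%
  arrange(value) %>%   # sort ascending, so each group's smallest values come first
  slice(-(1:2))        # drop the first two rows of each group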

Or

df %>%
  group_by(group) %>%
  filter(rank(value, ties.method = 'max') > 2)
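
The choice of ties.method matters when a tie straddles the cutoff. The sketch below uses a hypothetical vector v (not from the question) in which the second and third smallest values are equal, and compares 'max' with 'first' in base R.

```r
# Hypothetical values where the 2nd and 3rd smallest are tied
v <- c(3.3, 3.5, 3.5, 6.4)

rank_max   <- rank(v, ties.method = "max")    # tied values share the highest rank
rank_first <- rank(v, ties.method = "first")  # tied values get distinct ranks

# 'max' gives both 3.5s rank 3, so only one value is dropped
keep_max <- v[rank_max > 2]

# 'first' assigns ranks 2 and 3 to the 3.5s, so exactly two values are dropped
keep_first <- v[rank_first > 2]
```

With the question's data both methods agree, because the tie (4.2, 4.2) lies entirely within the two smallest values of group a. If exactly two rows per group must always be removed, 'first' may be the safer choice.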

Upvotes: 2
