Reputation: 1848
I have a data frame (df) such as:
group value
a 4.2
a 4.5
a 6.2
b 5.1
b 3.5
a 4.2
a 5.1
b 6.4
b 3.3
b 4.1
a 5.0
The desired output is
group value
a 4.5
a 6.2
a 5.1
a 5.0
b 5.1
b 6.4
b 4.1
Namely, the desired output drops the two smallest "value"s in each "group". For example, the two smallest values in group a are 4.2 and 4.2, and in group b they are 3.3 and 3.5.
The desired output includes all rows of df except the rows holding these values. How can I do that in R? I will be very glad for any help. Thanks a lot.
Upvotes: 2
Views: 1304
Reputation: 132676
Here is a solution using the data.table package and partial sorting (DF is the data frame from the question):
library(data.table)
setDT(DF)
# The two smallest values in each group:
DF[, sort(value, partial = 2)[1:2], by = group]
# group V1
#1: a 4.2
#2: a 4.2
#3: b 3.3
#4: b 3.5
# Everything except the two smallest values in each group:
DF[, sort(value, partial = 2)[-(1:2)], by = group]
# group V1
#1: a 6.2
#2: a 4.5
#3: a 5.1
#4: a 5.0
#5: b 6.4
#6: b 5.1
#7: b 4.1
Of course, one of the many, many alternatives for split-apply-combine type operations could be used instead.
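As one such alternative, here is a minimal base R sketch. It is not part of the data.table solution above: it assumes the question's data are rebuilt as a plain data frame named df, and the helper names r and result are made up for illustration. It ranks the values within each group using ave() and keeps the rows ranked above 2, so the other columns and the original row order survive.

```r
# Rebuild the example data from the question (assumed layout).
df <- data.frame(
  group = c("a", "a", "a", "b", "b", "a", "a", "b", "b", "b", "a"),
  value = c(4.2, 4.5, 6.2, 5.1, 3.5, 4.2, 5.1, 6.4, 3.3, 4.1, 5.0)
)

# Rank values within each group; ties.method = "first" gives every row
# a distinct rank, so exactly two rows per group get ranks 1 and 2.
r <- ave(df$value, df$group,
         FUN = function(v) rank(v, ties.method = "first"))

# Keep everything ranked above 2, then order by group (order() is stable,
# so rows keep their original relative order inside each group).
result <- df[r > 2, ]
result <- result[order(result$group), ]
result
```

On the example data this should leave four rows for group a and three for group b, matching the desired output in the question.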
Upvotes: 4
Reputation: 886998
An option using dplyr:
library(dplyr)
df %>%
  group_by(group) %>%
  arrange(value) %>%
  slice(-(1:2))
Or
df %>%
  group_by(group) %>%
  filter(rank(value, ties.method = 'max') > 2)
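One caveat worth checking with the rank() approach: the tie-breaking rule decides how many rows are dropped when the second-smallest value is tied with the next one. The small sketch below uses a made-up vector v (not from the question) to compare three ties.method settings in base R. In the question's data the tie is between the two smallest values of group a, so 'max' happens to drop exactly two rows there.

```r
# Hypothetical values: the second smallest (2) is tied with the third.
v <- c(1, 2, 2, 3)

r_max   <- rank(v, ties.method = "max")    # 1 3 3 4
r_min   <- rank(v, ties.method = "min")    # 1 2 2 4
r_first <- rank(v, ties.method = "first")  # 1 2 3 4

# Number of values that a "rank > 2" filter would drop under each rule.
dropped <- c(
  max   = sum(r_max <= 2),    # 1: only the value 1 is dropped
  min   = sum(r_min <= 2),    # 3: both tied 2s are dropped as well
  first = sum(r_first <= 2)   # 2: exactly two values are dropped
)
dropped
```

If exactly two rows per group must always be removed, ties.method = "first" looks like the safer choice under these assumptions.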
Upvotes: 2