Reputation: 585
I'd like to remove the duplicates from column x1 and x2 while keeping the higher value from x3.
DF:
x1 x2 x3
1 1 1
1 1 2
1 1 3
2 2 2
2 2 5
Expected result:
x1 x2 x3
1 1 3
2 2 5
I've gotten as far as df[!duplicated(df[,c(1,2)]),] but it's displaying the lowest value of x3. I'd like to get the highest x3 value.
Thanks ahead of time.
Upvotes: 3
Views: 3538
Reputation: 9618
Yet another alternative with data.table
:
library(data.table)
dt <- data.table(DF)
dt[,.SD[which.max(x3)],by=list(x1, x2)]
x1 x2 x3
1: 1 1 3
2: 2 2 5
Upvotes: 4
Reputation: 99331
You could aggregate()
, using the first two columns for grouping
aggregate(x3 ~ x1 + x2, df, max)
# x1 x2 x3
# 1 1 1 3
# 2 2 2 5
If you want to find the max in more than one column, you can add variables to the left hand side of the formula with cbind()
. For example,
aggregate(cbind(x3, x4, x5) ~ x1 + x2, df, max)
Upvotes: 7
Reputation: 23014
Using the dplyr
package:
library(dplyr)
df %>% group_by(x1,x2) %>% summarise(x3 = max(x3))
You could title the maximum variable "maxOfx3" or similar for clarity.
Edit: If you have additional variables whose maxima you want, you can include them in the summarise()
call:
df %>% group_by(x1,x2) %>% summarise(x3 = max(x3), x4 = max(x4), avg_of_x5 = mean(x5))
etc.
Upvotes: 5