Reputation: 359
I have a dataset as follows
Id Date Subject Score
12221 08/01/2007 Math 89
12221 08/01/2007 Math 92
12221 08/01/2007 Math 78
12221 11/01/2007 Math 36
12221 11/01/2007 Math 45
12221 11/01/2007 Math 24
2856 03/18/2004 Science 56
2856 03/18/2004 Science 49
2856 03/18/2004 Science 84
I am trying to retain only rows where the Score is maximum for a combination of Id, Date and Subject for example the final output should look like this below
Id Date Subject Score
12221 08/01/2007 Math 92
12221 11/01/2007 Math 45
2856 03/18/2004 Science 84
I tried aggregate
function,
aggregate(score ~ list(Id,Date,Subject), df, max)
this did not work, tried dcast
, which.max
etc neither of these are producing the desired results, any help on resolving this issue is much appreciated.
Upvotes: 1
Views: 1526
Reputation: 887391
We can use aggregate
aggregate(Score~., df1, FUN= max)
# Id Date Subject Score
#1 12221 08/01/2007 Math 92
#2 12221 11/01/2007 Math 45
#3 2856 03/18/2004 Science 84
Or with dplyr
library(dplyr)
df1 %>%
group_by(Id, Date, Subject) %>%
summarise(Score= max(Score))
Or using data.table
library(data.table)
setDT(df1)[, list(Score= max(Score)), by = .(Id, Date, Subject)]
Upvotes: 2