Reputation: 376
x <- data.frame(names=c("peter","peter", "jacob", "jacob"),
some_score =c(5,8,6,8),
xDate = as.Date(c("2018-01-01", "2019-01-01" , "2018-06-08",
"2019-10-22"))
)
In this dataframe in need to aggregate so i get the latest score for each name (peter= 8 and Jacob=8)
Is there a fast way to do that? Right now i am creating two aggregate dataframes and linking them. But this seems inefficient
Upvotes: 1
Views: 62
Reputation: 5788
Base R one liner:
data.frame(do.call("rbind", lapply(split(x, x$names), function(x){x[which.max(x$xDate),]})),
row.names = NULL)
Upvotes: 1
Reputation: 51
the dplyr package is great option for your question.
x <- data.frame(names=c("peter","peter", "jacob", "jacob"),
some_score =c(5,8,6,8),
xDate = as.Date(c("2018-01-01", "2019-01-01" , "2018-06-08",
"2019-10-22")))
library(dplyr)
x %>%
group_by(names) %>%
summarise(max_some_score = max(some_score))
Upvotes: 0
Reputation: 1258
Another solution:
library(magrittr)
x %>%
dplyr::group_by(names) %>%
dplyr::filter(xDate == max(xDate))
Upvotes: 1
Reputation: 388807
We could get the row with max xDate
for each names
.
This can be done using dplyr
library(dplyr)
x %>% group_by(names) %>% slice(which.max(xDate))
# names some_score Date
# <chr> <dbl> <date>
#1 jacob 8 2019-10-22
#2 peter 8 2019-01-01
Or data.table
library(data.table)
setDT(x)[, .SD[which.max(xDate)], names]
Upvotes: 3
Reputation: 10375
> aggregate(x,list(x$names),tail,1)
Group.1 names some_score xDate
1 jacob jacob 8 2019-10-22
2 peter peter 8 2019-01-01
assuming your dataframe is sorted, otherwise sort it first by time.
Upvotes: 2