Reputation: 309
I want to count the number of unique years since the first observation in a data.table. Example below:
data <- structure(list(club = c("Ajax", "Ajax", "Ajax", "Ajax", "Ajax",
"Ajax", "Barcelona", "Barcelona", "Barcelona"), years = c(1994,
1994, 1994, 1995, 1995, 1995, 2014, 2014, 2015)), row.names = c(NA,
-9L), class = c("data.table", "data.frame"))
club years
1: Ajax 1994
2: Ajax 1994
3: Ajax 1994
4: Ajax 1995
5: Ajax 1995
6: Ajax 1995
7: Barcelona 2014
8: Barcelona 2014
9: Barcelona 2015
I want this:
club years count
1: Ajax 1994 1
2: Ajax 1994 1
3: Ajax 1994 1
4: Ajax 1995 2
5: Ajax 1995 2
6: Ajax 1995 2
7: Barcelona 2014 1
8: Barcelona 2014 1
9: Barcelona 2015 2
Upvotes: 0
Views: 39
Reputation: 5798
Base R solution:
df$count <- with(
df,
unlist(
tapply(
years,
club,
FUN = function(x){
as.integer(
factor(
x,
levels = unique(
x
)
)
)
}
)
)
)
Upvotes: 0
Reputation: 887891
Use match
on the unique
'years' by 'club' and assign (:=
) the output to a new column
data[, count := match(years, unique(years)), by = club]
Or may use rleid
data[, count := rleid(years), club]
Upvotes: 1