brokkoo
brokkoo

Reputation: 309

Counting observations per unique year in group in data.table

I want to count the number of unique years since the first observation in a data.table. Example below:

data <- structure(list(club = c("Ajax", "Ajax", "Ajax", "Ajax", "Ajax", 
"Ajax", "Barcelona", "Barcelona", "Barcelona"), years = c(1994, 
1994, 1994, 1995, 1995, 1995, 2014, 2014, 2015)), row.names = c(NA, 
-9L), class = c("data.table", "data.frame"))
        club years
1:      Ajax  1994
2:      Ajax  1994
3:      Ajax  1994
4:      Ajax  1995
5:      Ajax  1995
6:      Ajax  1995
7: Barcelona  2014
8: Barcelona  2014
9: Barcelona  2015

I want this:

        club years count
1:      Ajax  1994     1
2:      Ajax  1994     1
3:      Ajax  1994     1
4:      Ajax  1995     2
5:      Ajax  1995     2
6:      Ajax  1995     2
7: Barcelona  2014     1
8: Barcelona  2014     1
9: Barcelona  2015     2

Upvotes: 0

Views: 39

Answers (2)

hello_friend
hello_friend

Reputation: 5798

Base R solution:

df$count <- with(
  df,
  unlist(
    tapply(
      years,
      club, 
      FUN = function(x){
        as.integer(
          factor(
            x, 
            levels = unique(
              x
            )
          )
        )
      }
    )
  )
)

Upvotes: 0

akrun
akrun

Reputation: 887891

Use match on the unique 'years' by 'club' and assign (:=) the output to a new column

data[, count := match(years, unique(years)), by = club]

Or may use rleid

data[, count := rleid(years), club]

Upvotes: 1

Related Questions