Conditionally changing the contents of a column using max() in data.table in R

Question

I have a data.table with the following info:

   library(data.table)
   DT <- data.table(id = rep(1, 5),
                    year = c(rep(2015, 3), rep(2016, 2)),
                    class = c(rep("A", 3), rep("B", 2)),
                    origin = c("Europe", "Asia", "Africa", "Europe", "Asia"),
                    count = c(30299, 3, 34, 2, 800))

   id year class origin count
1:  1 2015     A Europe 30299
2:  1 2015     A   Asia     3
3:  1 2015     A Africa    34
4:  1 2016     B Europe     2
5:  1 2016     B   Asia   800

However, for every id, year, class combination only one location (origin) is admissible. Here, the first combination has three locations:

1:  1 2015     A Europe 30299
2:  1 2015     A   Asia     3
3:  1 2015     A Africa    34

and the second combination has two locations:

4:  1 2016     B Europe     2
5:  1 2016     B   Asia   800

I want to change the locations, such that for every id, year, class combination the location with the highest count will be used. This should result in a table like this:

   id year class origin count
1:  1 2015     A Europe 30299
2:  1 2015     A Europe     3
3:  1 2015     A Europe    34
4:  1 2016     B   Asia     2
5:  1 2016     B   Asia   800

How can this be achieved? I was thinking of splitting the data table into a list of lists and then applying lapply, but I am sure there is a better/simpler solution. Any tips?

s_baldur · Accepted Answer

DT[, origin := origin[which.max(count)], by = .(id, year, class)]
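To see why this one-liner works, here is a self-contained sketch that rebuilds the table from the question (assuming it is stored under the name DT, which the question itself does not assign) and applies the same grouped update. Within each group, which.max(count) returns the position of the largest count, origin[...] picks the origin at that position, and := recycles that single value over all rows of the group. If two rows tie for the highest count, which.max() returns the first of them.

```r
library(data.table)

# Example data from the question; the name DT is an assumption
DT <- data.table(id = rep(1, 5),
                 year = c(rep(2015, 3), rep(2016, 2)),
                 class = c(rep("A", 3), rep("B", 2)),
                 origin = c("Europe", "Asia", "Africa", "Europe", "Asia"),
                 count = c(30299, 3, 34, 2, 800))

# For each (id, year, class) group: find the row with the largest count,
# take its origin, and assign it to every row of the group.
# := modifies DT by reference, so no reassignment is needed.
DT[, origin := origin[which.max(count)], by = .(id, year, class)]

print(DT)
```

Because the update happens by reference, the count column and the row order are left untouched; only origin changes. No splitting into lists or lapply is needed, since by = does the grouping.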
