Svenja
Svenja

Reputation: 119

Subset a data frame for date and keep all later dates in data.table

I have the following data.table

DT <- data.table(id=c(1,1,1,2,2,2,2),
                 date1=c("2013-11-22","2013-11-22","2013-11-22","2015-12-22","2015-12-22","2010-12-22","2010-12-22"),
                 variable=c(20.5,22.1,21.9,30.3,30.5,28.7,27.9))

I only want to keep the rows with the later dates to finally have this data frame

id      date1 variable
1:  1 2013-11-22     20.5
2:  1 2013-11-22     22.1
3:  1 2013-11-22     21.9
4:  2 2015-12-22     30.3
5:  2 2015-12-22     30.5

I am using data.table and tried something like this

setDT(DT)[,.SD[which.max(date1)],by='id']

But it only gives me the output for one row per id. I want to keep the rows with all the later dates.

Upvotes: 1

Views: 37

Answers (2)

arg0naut91
arg0naut91

Reputation: 14764

You could do:

DT[, .SD[date1 == max(date1)], by = 'id']

Output:

   id      date1 variable
1:  1 2013-11-22     20.5
2:  1 2013-11-22     22.1
3:  1 2013-11-22     21.9
4:  2 2015-12-22     30.3
5:  2 2015-12-22     30.5

Upvotes: 1

akrun
akrun

Reputation: 887391

We can use .I

DT[DT[, .I[date1 == max(date1)], id]$V1]
#  id      date1 variable
#1:  1 2013-11-22     20.5
#2:  1 2013-11-22     22.1
#3:  1 2013-11-22     21.9
#4:  2 2015-12-22     30.3
#5:  2 2015-12-22     30.5

Upvotes: 1

Related Questions