Reputation: 33
I've got pannel data in the following form:
DT = data.table(Id=1:5, Time1=c(0,0,1,2,0), Time2=c(1,0,0,0,0), Time3=c(0,1,2,0,0), Time4=c(0,1,2,5,4))
Id Time1 Time2 Time3 Time4
1 0 1 0 0
2 0 0 1 1
3 1 0 2 2
4 2 0 0 5
5 0 0 0 4
I want to extrac the first time for every ID which is not 0. The dataset is large, therefore I was looking for a solution with data.table.
I tried something using which.min(which!=0), but could work it out.
The output should look as follows:
Id Time1 Time2 Time3 Time4 Output
1 0 1 0 0 2
2 0 0 1 1 3
3 1 0 2 2 1
4 2 0 0 5 1
5 0 0 0 4 4
Thanks for your help.
Upvotes: 3
Views: 1305
Reputation: 99351
You could do
DT[, Output := which.max(.SD != 0), by = Id][]
# Id Time1 Time2 Time3 Time4 Output
# 1: 1 0 1 0 0 2
# 2: 2 0 0 1 1 3
# 3: 3 1 0 2 2 1
# 4: 4 2 0 0 5 1
# 5: 5 0 0 0 4 4
which.max
finds the index of the first occurrence of the maximum value. So when we do .SD != 0
it finds the first occurrence of TRUE
.
Equivalently, we could also use
DT[, Output := match(TRUE, .SD != 0), by = Id]
# or
DT[, Output := which(.SD != 0)[1], by = Id]
Upvotes: 3