I have a small question regarding data.table.
library(data.table)
data<-data.table(id=c(1,1,2,2,2),t=c(1,3,1,2,3),value_to_see=c(1,3,4,5,6))
data[,var_to_impute:=value_to_see[t==3],by=c("id")]
Here both id=1 and id=2 have a row with t=3, so value_to_see[t==3] exists in each group and the imputation is correct.
   id t value_to_see var_to_impute
1:  1 1            1             3
2:  1 3            3             3
3:  2 1            4             6
4:  2 2            5             6
5:  2 3            6             6
Now, assume that I accidentally do the following:
data[,var_to_impute:=value_to_see[t==2],by=c("id")]
   id t value_to_see var_to_impute
1:  1 1            1             3
2:  1 3            3             3
3:  2 1            4             5
4:  2 2            5             5
5:  2 3            6             5
I expected to have var_to_impute = NA for id=1 but I get the previous value.
Whereas if I do:
data[,var_to_impute:=NULL]
data[,var_to_impute:=value_to_see[t==2],by=c("id")]
   id t value_to_see var_to_impute
1:  1 1            1            NA
2:  1 3            3            NA
3:  2 1            4             5
4:  2 2            5             5
5:  2 3            6             5
This is exactly what I expected. Can somebody explain what is going on here?
Reputation: 20095
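The behavior of data.table observed by the OP is expected. Let's go through it step by step. When the right-hand side of := has length zero for a group, nothing is assigned to that group: an existing column keeps its old values there, while a newly created column stays at its initial NA.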
library(data.table)
data<-data.table(id=c(1,1,2,2,2),t=c(1,3,1,2,3),value_to_see=c(1,3,4,5,6))
data[,var_to_impute:=value_to_see[t==3],by=c("id")] # var_to_impute = 3 for the first 2 rows
# The statement below changes the values for id == 2 only. No row with
# id == 1 has t == 2, so value_to_see[t==2] is empty for that group and
# nothing is assigned: the rows with id == 1 remain unchanged.
data[,var_to_impute:=value_to_see[t==2],by=c("id")]
#Result
data
#    id t value_to_see var_to_impute
# 1:  1 1            1             3 <- unchanged
# 2:  1 3            3             3 <- unchanged
# 3:  2 1            4             5
# 4:  2 2            5             5
# 5:  2 3            6             5
# 2nd scenario: don't execute data[,var_to_impute:=value_to_see[t==3],by=c("id")]
data<-data.table(id=c(1,1,2,2,2),t=c(1,3,1,2,3),value_to_see=c(1,3,4,5,6))
data[,var_to_impute:=value_to_see[t==2],by=c("id")] #Step 3 directly
data
#    id t value_to_see var_to_impute
# 1:  1 1            1            NA <- Nothing is there to assign. Hence NA
# 2:  1 3            3            NA <- Nothing is there to assign. Hence NA
# 3:  2 1            4             5
# 4:  2 2            5             5
# 5:  2 3            6             5
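If the goal is to get NA for groups that have no matching row, even when the column already exists, one possible approach is to make the right-hand side always have length one. The sketch below is not part of the original answer. It first checks the length of the right-hand side per group (`rhs_len` is a name introduced here for illustration), then uses base R's match(), which returns NA when the value is absent. Note that match() picks the first matching row if a group has several rows with t == 2.

```r
library(data.table)

data <- data.table(id = c(1, 1, 2, 2, 2),
                   t = c(1, 3, 1, 2, 3),
                   value_to_see = c(1, 3, 4, 5, 6))

# Length of the right-hand side in each group:
# zero for id == 1 (no row with t == 2), one for id == 2.
rhs_len <- data[, .(n = length(value_to_see[t == 2])), by = id]
print(rhs_len)

# First assignment, as in the question.
data[, var_to_impute := value_to_see[t == 3], by = id]

# match(2, t) is NA when 2 does not occur in t, and indexing with NA
# yields NA, so every group receives a length-one value and the old
# values for id == 1 are overwritten with NA.
data[, var_to_impute := value_to_see[match(2, t)], by = id]
print(data)
```

With this form the result no longer depends on whether var_to_impute existed before the assignment.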