Reputation: 3919
I'm currently doing some tests with the set() function in the data.table package in R and have the following code:
library(data.table)
dt <- data.table(ans = rep(c(14, 16), 100))
dt[, voy := 0.0]
set(dt, which(dt[, ans] == 14), "voy", log(dt[, ans]))
dt
Note that I want to compute the logarithm for the rows having ans=14 using the set() function, but I'm not getting the correct result. This is the result I got:
     ans      voy
  1:  14 2.639057
  2:  16 0.000000
  3:  14 2.772589
  4:  16 0.000000
  5:  14 2.639057
 ---
196:  16 0.000000
197:  14 2.639057
198:  16 0.000000
199:  14 2.772589
200:  16 0.000000
You may note that for some rows the value of the variable voy is the expected log(14)=2.639057, but other rows having ans=14 are assigned 2.772589=log(16). So I think I'm misusing the set() function. How can I solve this?
I know the following code can be used to carry this out:
dt[ans==14,voy:=log(ans)]
But I want to translate this into the set() function syntax.
Upvotes: 2
Views: 2308
Reputation: 4223
You need to subset the data passed as the value argument as well. In your case, the warning "Supplied 200 items to be assigned to 100 items of column 'voy' (100 unused)" could have given you a hint. set() was taking, one by one, the first 100 values of log(dt$ans), which alternate between log(14) and log(16), and assigning them to the 100 rows where ans is 14.
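The length mismatch can be checked directly. The following is only a small sketch that rebuilds the table from the question; the names rows and value are introduced here for illustration and are not part of the original code.

```r
library(data.table)

# Rebuild the example table from the question
dt <- data.table(ans = rep(c(14, 16), 100))

# Hypothetical helper names, used only for this illustration
rows  <- which(dt[, ans] == 14)  # rows that should be updated
value <- log(dt[, ans])          # computed over ALL rows, not just the selected ones

length(rows)   # 100
length(value)  # 200

# The first elements of value alternate between log(14) and log(16),
# which is why every second selected row ended up with log(16).
head(value, 4)
```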
This way it works:
set(dt,which(dt[,ans]==14),"voy",log(dt[ans==14,ans]))
giving:
     ans      voy
  1:  14 2.639057
  2:  16 0.000000
  3:  14 2.639057
  4:  16 0.000000
  5:  14 2.639057
 ---
196:  16 0.000000
197:  14 2.639057
198:  16 0.000000
199:  14 2.639057
200:  16 0.000000
But it's ugly code, as @Andrie already remarked.
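One way to make it a little less repetitive, offered only as a sketch, is to compute the row index once and reuse it for both the i and value arguments of set(). The name idx is an assumption introduced here, not something from the question.

```r
library(data.table)

dt <- data.table(ans = rep(c(14, 16), 100))
dt[, voy := 0.0]

# Compute the row index once (idx is a hypothetical name for this sketch)
idx <- which(dt$ans == 14)

# value now has exactly one element per row being updated,
# so nothing is left unused and nothing is misaligned
set(dt, i = idx, j = "voy", value = log(dt$ans[idx]))

dt
```

Because the same index selects both the target rows and the source values, the lengths of i and value always agree.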
Upvotes: 4