Reputation: 143
This is what I usually do whenever I want to add a new column to a data.frame, where the new column is the result of a function applied to different subsets of that data.frame.
What do you think is the best way to get the same result using the data.table package in R?
Cheers!
> class(DF)
[1] "data.frame"
> names(DF)
[1] "sp" "X1" "X2"
paramsVal <- c(0.32, 0.23, 8.28, 8.37)
DF <- split(DF, DF$sp)
DF <- lapply(seq_along(DF), function(X){
  Data <- DF[[X]]
  # X is only the list index, so the species must be read from Data
  if(unique(Data$sp) == "SP1"){
    Data$Pred <- fakeFunction(Data = Data,
                              param1 = paramsVal[1],
                              param2 = paramsVal[3])
  }else{
    Data$Pred <- fakeFunction(Data = Data,
                              param1 = paramsVal[2],
                              param2 = paramsVal[4])
  }
  return(Data)
})
DF <- do.call("rbind", DF)
names(DF)
[1] "sp" "X1" "X2" "Pred"
Upvotes: 0
Views: 259
Reputation: 145975
With data.table, I would do this:
DT = as.data.table(DF)
DT[sp == "SP1", Pred := fakeFunction(Data = .SD, param1 = paramsVal[1], param2 = paramsVal[3])]
DT[sp != "SP1", Pred := fakeFunction(Data = .SD, param1 = paramsVal[2], param2 = paramsVal[4])]
I think this should work, but I can't test it without a reproducible example. If you need more assistance, please provide (a) a copy/pasteable sample of the data (just a couple of rows each of SP1 and not SP1; use dput() for reproducibility), and (b) a stand-in for fakeFunction, paramsVal, and anything else needed for the example to run.
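For what it's worth, here is a minimal sketch of what such a runnable example could look like. The body of fakeFunction and the sample rows are made up purely for illustration (neither is shown in the question); only paramsVal and the column names sp, X1, X2 come from the question. It assumes fakeFunction returns one value per row of the data it is given.

```r
library(data.table)

# Hypothetical stand-in for fakeFunction (the real one is not shown):
# returns one prediction per row of Data.
fakeFunction <- function(Data, param1, param2) {
  param1 * Data$X1 + param2 * Data$X2
}

paramsVal <- c(0.32, 0.23, 8.28, 8.37)

# Made-up sample data: two rows of SP1 and two rows of another species.
DF <- data.frame(
  sp = c("SP1", "SP1", "SP2", "SP2"),
  X1 = c(1, 2, 3, 4),
  X2 = c(10, 20, 30, 40)
)

DT <- as.data.table(DF)

# Each call assigns Pred only on the rows selected in i;
# inside j, .SD holds just those rows.
DT[sp == "SP1", Pred := fakeFunction(Data = .SD, param1 = paramsVal[1], param2 = paramsVal[3])]
DT[sp != "SP1", Pred := fakeFunction(Data = .SD, param1 = paramsVal[2], param2 = paramsVal[4])]

print(DT)
```

Because := updates DT by reference, there is no need for the split / lapply / rbind round trip, and the row order of the original data is kept.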
Upvotes: 1