Reputation: 3288
For example:
require(RevoScaleR)
# Create a data frame
set.seed(100)
myData = data.frame(x = 1:100, y = rep(c("a", "b", "c", "d"), 25),
z = rnorm(100), w = runif(100))
# Create a multi-block .xdf file from the data frame
inputFile = file.path(tempdir(), "testInput.xdf")
rxDataStep(inData = myData, outFile = inputFile, rowsPerRead = 50,
overwrite = TRUE)
# Square the values in the column "z"; this works fine
rxDataStep(inData = inputFile, outFile = inputFile, overwrite = TRUE,
transforms = list(z = z^2))
# Define a squaring function and try to use it to repeat the previous step:
myFun = function(x) x^2
rxDataStep(inData = inputFile, outFile = inputFile, overwrite = TRUE,
transforms = list(z = myFun(z)))
The final step fails with the error:
Error in transformation function: Error in eval(expr, envir, enclos) : could not find function "myFun"
The documentation for rxDataStep states that "As with all expressions, transforms ... can be defined outside of the function call using the expression function." But I have no idea how to implement this advice, and I can't find an example. For instance, the following does not work:
myFun = expression(function(x) x^2)
rxDataStep(inData = inputFile, outFile = inputFile, overwrite = TRUE,
transforms = list(z = myFun(z)))
Upvotes: 1
Views: 982
Reputation: 854
You can certainly pass an expression to transforms that was created outside of the function call. It would look something like this:
myFun <- expression(
  list(x2 = x^2,
       z2 = z^2))
rxDataStep(inData = inputFile, outFile = inputFile, overwrite = TRUE,
transforms = myFun)
If you want to pass a function, as in your first example, use transformFunc instead. The function receives the data as a list and returns the modified list, so it would look something like this:
myFun2 <- function(dataList) {
  dataList$x2 <- dataList$x^2
  dataList$z2 <- dataList$z^2
  dataList
}
rxDataStep(inData = inputFile, outFile = inputFile, overwrite = TRUE,
transformFunc = myFun2)
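To see what the two forms actually are, here is a small base-R sketch that does not call RevoScaleR at all. The `dataList` below is a made-up stand-in for one block of data, and `eval()` is used only to illustrate how stored code can be resolved against named variables; it is not a claim about how rxDataStep does this internally.

```r
# Made-up stand-in for one block of data (not read from the .xdf file).
dataList <- list(x = 1:3, z = c(0.5, -1, 2))

# expression() stores unevaluated code; nothing is computed at this point.
myFun <- expression(list(x2 = x^2, z2 = z^2))

# An expression object is not a function, so myFun(z) cannot be called.
isCallable <- is.function(myFun)

# Evaluating the stored code against the data resolves x and z by name.
fromExpression <- eval(myFun[[1]], dataList)

# The function form receives the whole list and returns it with new elements.
myFun2 <- function(dataList) {
  dataList$x2 <- dataList$x^2
  dataList$z2 <- dataList$z^2
  dataList
}
fromFunction <- myFun2(dataList)
```

This also suggests why wrapping a function definition in expression(), as in the question, does not help: the result is stored code rather than something callable.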
Upvotes: 2
Reputation: 3288
No idea why this works!
env <- new.env()
env$myFun <- function(x) x^2
rxDataStep(inData = inputFile, outFile = inputFile, overwrite = TRUE,
transforms = list(z = myFun(z)), transformEnvir=env)
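A possible explanation, sketched in base R only: if the transform is evaluated in an environment that cannot see the calling workspace, a function defined there is not found until it is placed in the evaluation environment. The `chunk` and `isolated` objects below are hypothetical, and the assumption that rxDataStep behaves this way is mine, not something the documentation quoted above confirms.

```r
# Toy illustration; it does not use RevoScaleR and does not claim to
# reproduce how rxDataStep evaluates transforms internally.
chunk <- list(z = c(1, 2, 3))            # stand-in for one block of data

# A deliberately isolated environment: its parent is the empty environment,
# so nothing defined in the calling workspace is visible from it.
isolated <- new.env(parent = emptyenv())

myFun <- function(x) x^2                 # defined in the calling workspace

# Variables come from `chunk`; functions are looked up via `isolated`.
failed <- tryCatch(
  eval(quote(myFun(z)), envir = chunk, enclos = isolated),
  error = function(e) conditionMessage(e)
)

# Copying the function into the evaluation environment makes it visible.
isolated$myFun <- myFun
fixed <- eval(quote(myFun(z)), envir = chunk, enclos = isolated)
```

Here `failed` holds the error message from the first attempt and `fixed` holds the squared values, which mirrors the before and after of supplying transformEnvir.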
Upvotes: 1