tkerwin

Reputation: 9769

Problem passing data to function in R

I don't have much experience in R, so correct me if I'm making an elementary mistake:

I have a function:

ctree_result <- function(yval, training, testing) {
    print(yval)
    trained_tree <- ctree(formula = ordered(yval) ~ ., subset=training, data=ealls)
    print("here")
    tree_cor <- cor(yval[testing], as.numeric(predict(trained_tree, ealls[testing])))
    c_mat <- rbind(yval[testing], as.numeric(predict(trained_tree, ealls[testing])))
    tree_kappa <- cohen.kappa(t(c_mat))
    return(c(tree_cor, tree_kappa))
}

When I call it (with any data, but for example):

ctree_result(emean.data$mean.Shape, 1:70, 71:80)

I get the error: Error in factor(x, ..., ordered = TRUE) : object 'yval' not found. However, the first print statement works and the vector is printed out, while the second print statement never runs. It seems yval isn't being passed through to ctree.

I can run the ctree function manually as:

yval <- emean.data$mean.Shape
sauc_tree = ctree(formula = ordered(yval) ~ . , data=ealls)

with no problems. ealls and emean.data are global datasets I define earlier.

Upvotes: 0

Views: 636

Answers (2)

Aniko

Reputation: 18884

A flexible solution is to create a formula that contains the name of the variable that you are actually going to use. Here is a reproducible example using the lm function:

lm_result <- function(yvar){
  fla <- as.formula(paste(yvar, " ~ Species"))
  lm(fla, data=iris)
}

lm_result("Petal.Length")

Note that you have to pass the name of the variable instead of the variable itself for this approach.
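A variant of the same idea, as a minimal sketch: base R's reformulate() builds the formula from character strings without pasting them together by hand. The function name lm_result2 and its data argument are hypothetical, introduced only for this illustration, and iris stands in for the real dataset.

```r
# Build the formula from the variable name with stats::reformulate()
# (lm_result2 is a hypothetical name used only for this sketch)
lm_result2 <- function(yvar, data = iris) {
  # reformulate("Species", response = "Petal.Length") gives Petal.Length ~ Species
  fla <- reformulate("Species", response = yvar)
  lm(fla, data = data)
}

fit <- lm_result2("Petal.Length")
print(formula(fit))
```

As with the paste() version, the response is passed as a name (a string), not as the vector itself.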

Upvotes: 0

Shane

Reputation: 100204

Your problem is with the call to ctree. The dataset ealls isn't supplied through your function's parameters, so I presume it's a global dataset. The formula is looking for a field named yval in the ealls dataset. If you want to use the yval value from your function's parameter, then you should pass a dataset that contains it as the data argument of ctree, and make sure that it has a named column matching the formula.

An example of proper usage would be something like this (this is incomplete code):

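ctree.result <- function(emean.data, training, ...) {
    # mean.Shape must be a column of emean.data so the formula can find it
    trained_tree <- ctree(formula = ordered(mean.Shape) ~ ., subset=training, data=emean.data)
    # ... (remaining steps omitted)
}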

Where emean.data is your dataset with a column named mean.Shape.

I suggest that you look at help(ctree) and follow any supplied examples to see how that is supposed to be used.

Edit:

As discussed in chat, you can try to add the additional data into the dataset before calling ctree. The formula expects the data to be in the dataset.
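A minimal sketch of that suggestion, under some assumptions: it uses lm and the built-in iris data as stand-ins, because ealls and emean.data are not available here, and the function name fit_with_response is hypothetical. The same pattern (copy the response into the data frame, then refer to it by column name in the formula) should carry over to ctree, but that has not been checked against the original data.

```r
# Hypothetical helper: attach the response to the data before fitting,
# so the formula only refers to columns of the data frame.
fit_with_response <- function(yval, training, data) {
  data$yval <- yval                        # response becomes a named column
  train <- data[training, , drop = FALSE]  # keep only the training rows
  lm(yval ~ ., data = train)               # stand-in for ctree(...)
}

# Stand-in predictors and response taken from iris
predictors <- iris[, c("Sepal.Length", "Sepal.Width")]
fit <- fit_with_response(iris$Petal.Length, 1:100, predictors)
print(coef(fit))
```

Subsetting the rows before fitting, rather than passing a subset argument, avoids relying on how the modelling function looks up variables from inside another function.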

Upvotes: 3
