Reputation: 2022
I have a question about sapply in R. In my example I'm using it for leave-one-out cross-validation:
##' Calculates the LOO CV score for given data and regression prediction function
##'
##' @param reg.data: regression data; data.frame with columns 'x', 'y'
##' @param reg.fcn: regr.prediction function; arguments:
##' reg.x: regression x-values
##' reg.y: regression y-values
##' x: x-value(s) of evaluation point(s)
##' value: prediction at point(s) x
##' @return LOOCV score
loocv <- function(reg.data, reg.fcn)
{
    ## Help function to calculate leave-one-out regression values
    loo.reg.value <- function(i, reg.data, reg.fcn)
        return(reg.fcn(reg.data$x[-i],reg.data$y[-i], reg.data$x[i]))
    ## Calculate LOO regression values using the help function above
    n <- nrow(reg.data)
    loo.values <- sapply(seq(1,n), loo.reg.value, reg.data, reg.fcn)
    ## Calculate and return MSE
    return(???)
}
My questions about sapply are the following:
1. Can I use sapply with multiple arguments and functions, i.e. sapply(X1,FUN1,X2,FUN2,..), where X1 and X2 are my function arguments for the functions FUN1 and FUN2 respectively?
2. In the code above, I pass 1:n to the function loo.reg.value. However, this function has multiple arguments, in fact 3: integer i, regression data reg.data and regression function reg.fcn. If the function in sapply has more than one argument, and my X covers just one of the arguments, does sapply use it as the "first argument"? So would it be the same as sapply(c(1:n,reg.data,reg.fcn),loo.reg.value, reg.data, reg.fcn)?
Thank you for your help.
Upvotes: 4
Views: 576
Reputation: 174803
In answer to the first question: yes, you can use multiple functions, but the second and subsequent functions need to be passed to the first function, which then passes them on to the next function, and so on. Hence each function needs to be written so that it takes additional arguments and passes them on.
For example
foo <- function(x, f1, ...) f1(x, ...)
bar <- function(y, f2, ...) f2(y, ...)
foobar <- function(z, f3, ...) f3(z)
sapply(1:10, foo, f1 = bar, y = 2, f2 = foobar, z = 4, f3 = seq_len)
> sapply(1:10, foo, f1 = bar, y = 2, f2 = foobar, z = 4, f3 = seq_len)
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]    1    1    1    1    1    1    1    1    1     1
[2,]    2    2    2    2    2    2    2    2    2     2
[3,]    3    3    3    3    3    3    3    3    3     3
[4,]    4    4    4    4    4    4    4    4    4     4
This is a silly example, but it shows how to pass extra arguments to foo(), initially as part of the ... argument of sapply(). It also shows how to have foo() and subsequent functions take extra arguments to be passed on, simply by using ... in the function definition and in the call to the next function, e.g. f2(y, ...). Note that I also avoid issues with positional matching by naming all the additional arguments supplied to foo().
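A smaller sketch of the same idea, with a single function, may make the role of ... easier to see. The helper scale_shift() and its arguments are made up for illustration and are not from the question:

```r
# Hypothetical helper, invented for this illustration
scale_shift <- function(x, scale, shift) x * scale + shift

# 1:3 supplies x; scale and shift travel through sapply()'s ... argument
by_name <- sapply(1:3, scale_shift, scale = 10, shift = 1)

# Because the extras are named, their order does not matter
swapped <- sapply(1:3, scale_shift, shift = 1, scale = 10)

# Unnamed extras are matched by position, so here the order does matter
by_position <- sapply(1:3, scale_shift, 10, 1)

print(by_name)
```

All three calls give the same result here; only the positional version would change if the two extra values were swapped.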
With regard to question 2, I think the way you explain it is over-complicating things. You have, for example, duplicated the reg.data and reg.fcn bits in what R iterates over with sapply(), which isn't correct: it implies you iterate over every element of the combined object c(1:n,reg.data,reg.fcn), not just over 1:n.
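To see what sapply() would actually be asked to iterate over in that case, one can inspect the combined object directly. The toy data frame and the placeholder reg.fcn below are assumptions made only for this sketch:

```r
n <- 3
reg.data <- data.frame(x = 1:3, y = c(2, 4, 6))    # toy data, assumed
reg.fcn <- function(reg.x, reg.y, x) mean(reg.y)   # placeholder, assumed

# c() flattens everything into one list: the n integers,
# the columns of the data frame, and the function itself
combined <- c(1:n, reg.data, reg.fcn)

is.list(combined)
length(combined)   # n integers + 2 columns + 1 function
length(1:n)        # what the original call iterates over
```

So iterating over the combined object would also call loo.reg.value() with a data column and with a function as i, which is not what is wanted.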
sapply(1:n, fun, arg1, arg2)
is equivalent to
fun(1, arg1, arg2)
fun(2, arg1, arg2)
....
fun(n, arg1, arg2)
whilst sapply(1:n, fun, arg1 = bar, arg2 = foobar)
is equivalent to
fun(1, arg1 = bar, arg2 = foobar)
fun(2, arg1 = bar, arg2 = foobar)
....
fun(n, arg1 = bar, arg2 = foobar)
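The equivalence can be checked with a quick sketch. Here fun, arg1 and arg2 are stand-ins chosen for illustration, and the explicit loop only mimics the calls that sapply() makes before it collects the results into a vector:

```r
# Hypothetical stand-ins for fun, arg1 and arg2
fun <- function(i, arg1, arg2) i * arg1 + arg2
arg1 <- 2
arg2 <- 100
n <- 5

# One call per element of 1:n, with the same extras each time
via_sapply <- sapply(seq_len(n), fun, arg1, arg2)

# The same calls spelled out in a loop
via_loop <- numeric(n)
for (i in seq_len(n)) {
  via_loop[i] <- fun(i, arg1, arg2)
}

identical(via_sapply, via_loop)
```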
Upvotes: 2
Reputation: 568
The function you pass to sapply can take as many arguments as you'd like (within reason, of course), but only the first argument varies between calls: all the other arguments are passed unchanged to every application. Have you tried running this code? It looks like it will work.
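For anyone who wants to try it, here is a minimal runnable sketch. Two parts are my assumptions rather than the asker's code: the score is taken to be the mean squared difference between the observed y and the leave-one-out predictions, and lin.reg() is a hypothetical prediction function based on lm().

```r
loocv <- function(reg.data, reg.fcn) {
  # Prediction at x[i] from a fit that leaves observation i out
  loo.reg.value <- function(i, reg.data, reg.fcn) {
    reg.fcn(reg.data$x[-i], reg.data$y[-i], reg.data$x[i])
  }
  n <- nrow(reg.data)
  # i runs over 1..n; reg.data and reg.fcn are passed unchanged each time
  loo.values <- sapply(seq_len(n), loo.reg.value, reg.data, reg.fcn)
  # Assumption: the LOOCV score is the mean squared prediction error
  mean((reg.data$y - loo.values)^2)
}

# Hypothetical prediction function: simple linear regression via lm()
lin.reg <- function(reg.x, reg.y, x) {
  fit <- lm(y ~ x, data = data.frame(x = reg.x, y = reg.y))
  unname(predict(fit, newdata = data.frame(x = x)))
}

# Toy data, made up for the example
set.seed(1)
toy <- data.frame(x = 1:20)
toy$y <- 2 * toy$x + rnorm(20)

score <- loocv(toy, lin.reg)
print(score)
```

Any other function with the same three arguments (reg.x, reg.y, x) could be swapped in for lin.reg() without changing loocv().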
Upvotes: 1