Reputation: 503
Friends, I have a data frame from which I want to choose every unique combination of three columns and pass each combination to a custom function that returns a column vector plus some scalars. I was wondering whether there is a concise way to do this in R. Let me elaborate.
Suppose this is my data frame:
> data
  X1 X2 X3 X4 X5
1  1  5  9 13 17
2  2  6 10 14 18
3  3  7 11 15 19
4  4  8 12 16 20
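For anyone who wants to reproduce this, the frame above can be rebuilt in one line. This is only a sketch: the question does not show how `data` was created, so building it with `data.frame(matrix(...))` is an assumption that happens to give the same values and column names.

```r
# Rebuild the example data frame: 1:20 filled column-wise into 4 rows.
# data.frame() names the columns of an unnamed matrix X1, X2, ...
data <- data.frame(matrix(1:20, nrow = 4))
print(data)
```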
I then create an index that picks every unique combination of three columns from this data frame:
> cols=combn(ncol(data), 3)
> cols
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]    1    1    1    1    1    1    2    2    2     3
[2,]    2    2    2    3    3    4    3    3    4     4
[3,]    3    4    5    4    5    5    4    5    5     5
And concatenate the names:
> nams<-apply(combn(colnames(data),3), 2, function(z) paste(z, collapse = ' '))
> nams
[1] "X1 X2 X3" "X1 X2 X4" "X1 X2 X5" "X1 X3 X4" "X1 X3 X5" "X1 X4 X5" "X2 X3 X4"
[8] "X2 X3 X5" "X2 X4 X5" "X3 X4 X5"
Now this is the part where I am not sure how to proceed. How do I best write a custom function that operates on the three vectors sent to it and returns an object consisting of a vector and scalars, e.g. X1+X2+X3 together with scalars such as the means and standard deviations of X1, X2 and X3?
Suppose the following is the function:
someFunc <- function(subMat){
  out <- list() # collect the results in a list, not in the function object itself
  out$vector <- subMat[,1]+subMat[,2]+subMat[,3] # return vector
  out$mean1 <- mean(subMat[,1])
  out$sd1 <- sd(subMat[,1])
  out$mean2 <- mean(subMat[,2])
  out$sd2 <- sd(subMat[,2])
  out$mean3 <- mean(subMat[,3])
  out$sd3 <- sd(subMat[,3])
  return(out)
}
I guess I'm trying to figure out how custom functions in R could be built to handle and send vectors+scalars, hence consider the above function as my rough attempt.
Next step. In another post, @Prasad Chalasani noted that the proper R way of passing the vectors to such a function is to use the apply function, but I can't seem to put the pieces together.
result <- apply( cols, 2, someFunc ........
Let me know if any of this is vague and I will do my best to clarify the problem further. In summary: I have a data frame, and I want to send every unique combination of three of its columns to a function that returns multiple results. I am having trouble defining such a function and sending data to it using apply.
Upvotes: 0
Views: 118
Reputation: 115392
You can pass a function to combn, and this appears to be what you really want to do. In this case you don't want to simplify the results, so set simplify = FALSE.
A list would appear to be what you want to return.
You can use setNames to set the names of the result in a single line, using combn with paste to give a reasonable name.
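As a small illustration of those three pieces on their own, here is a sketch that uses the integers 1 to 4 instead of column names. The names `pair_sums` and `pair_info` are made up for this example.

```r
# Default behaviour: results are simplified when FUN returns single numbers.
pair_sums <- combn(4, 2, FUN = sum)

# simplify = FALSE keeps one list element per combination, which is what
# you need when FUN returns something richer than a single number.
pair_info <- combn(4, 2,
                   FUN = function(idx) list(idx = idx, total = sum(idx)),
                   simplify = FALSE)

# Name each element after the combination it came from, e.g. "1:2".
pair_info <- setNames(pair_info, combn(4, 2, FUN = paste, collapse = ":"))

pair_info[["2:3"]]$total
```

The function for the actual problem follows the same pattern, with column names in place of the integers.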
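someFunc <- function(dd, .which){
  # row-wise sum of the selected columns, stored under a name like 'sum.X1.X2.X3'
  rr <- list(vec = rowSums(dd[,.which]))
  names(rr) <- paste('sum', paste(.which,collapse = '.'),sep='.')
  # per-column means and standard deviations, as named lists
  means <- lapply(dd[.which],mean)
  sds <- lapply(dd[.which], sd)
  # one flat list: the sum vector, then the sd.<col> and mean.<col> scalars
  return(c(rr, sd = sds, mean =means))
}
# apply someFunc to every combination of three column names, keeping a list,
# and name each element after its combination, e.g. 'X1:X2:X3'
results <- setNames(combn(names(data),3, FUN = someFunc,dd = data, simplify = FALSE),
                    combn(names(data),3, FUN = paste, collapse=':'))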
head(results,2)
# $`X1:X2:X3`
# $`X1:X2:X3`$sum.X1.X2.X3
# 1 2 3 4
# 15 18 21 24
#
# $`X1:X2:X3`$sd.X1
# [1] 1.290994
#
# $`X1:X2:X3`$sd.X2
# [1] 1.290994
#
# $`X1:X2:X3`$sd.X3
# [1] 1.290994
#
# $`X1:X2:X3`$mean.X1
# [1] 2.5
#
# $`X1:X2:X3`$mean.X2
# [1] 6.5
#
# $`X1:X2:X3`$mean.X3
# [1] 10.5
#
#
# $`X1:X2:X4`
# $`X1:X2:X4`$sum.X1.X2.X4
# 1 2 3 4
# 19 22 25 28
#
# $`X1:X2:X4`$sd.X1
# [1] 1.290994
#
# $`X1:X2:X4`$sd.X2
# [1] 1.290994
#
# $`X1:X2:X4`$sd.X4
# [1] 1.290994
#
# $`X1:X2:X4`$mean.X1
# [1] 2.5
#
# $`X1:X2:X4`$mean.X2
# [1] 6.5
#
# $`X1:X2:X4`$mean.X4
# [1] 14.5
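If you would rather keep the index matrix from the question, something along the following lines should give the same list. This is a sketch rather than a tested recipe for your real data: it rebuilds the example frame (an assumption about how `data` was made), repeats a compact version of someFunc so that it runs on its own, and uses the made-up names `results2` and `mean_table`.

```r
# Self-contained setup: example data and a compact copy of someFunc.
data <- data.frame(matrix(1:20, nrow = 4))
someFunc <- function(dd, .which) {
  rr <- list(vec = rowSums(dd[, .which]))
  names(rr) <- paste("sum", paste(.which, collapse = "."), sep = ".")
  c(rr, sd = lapply(dd[.which], sd), mean = lapply(dd[.which], mean))
}

# Loop over the columns of the index matrix, as with `cols` in the question.
cols <- combn(ncol(data), 3)
results2 <- lapply(seq_len(ncol(cols)),
                   function(j) someFunc(data, names(data)[cols[, j]]))
names(results2) <- apply(cols, 2,
                         function(idx) paste(names(data)[idx], collapse = ":"))

# Pull out a single scalar by name.
results2[["X1:X2:X3"]]$mean.X1

# Gather the three column means of every combination into a 10 x 3 matrix.
mean_table <- t(sapply(results2, function(res) {
  unname(unlist(res[grepl("^mean\\.", names(res))]))
}))
```

lapply is used here instead of apply so that the result is always a plain list, whatever someFunc returns.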
Upvotes: 0