KirkH

Reputation: 503

How to send unique cols of a dataframe to a custom function that handles vectors

Friends, I have a data frame from which I want to choose every unique combination of three columns and send each combination to a custom function that returns a column vector plus some scalars. Is there a concise way to do this in R? Let me elaborate.

Suppose this is my data frame:

> data
  X1 X2 X3 X4 X5
1  1  5  9 13 17
2  2  6 10 14 18
3  3  7 11 15 19
4  4  8 12 16 20

I then create an index of all unique combinations of three columns from this data frame:

> cols=combn(ncol(data), 3)
> cols
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]    1    1    1    1    1    1    2    2    2     3
[2,]    2    2    2    3    3    4    3    3    4     4
[3,]    3    4    5    4    5    5    4    5    5     5

And concatenate the names:

> nams<-apply(combn(colnames(data),3), 2, function(z) paste(z, collapse = ' '))
> nams
 [1] "X1 X2 X3" "X1 X2 X4" "X1 X2 X5" "X1 X3 X4" "X1 X3 X5" "X1 X4 X5" "X2 X3 X4"
 [8] "X2 X3 X5" "X2 X4 X5" "X3 X4 X5"

Now this is the part where I am not sure how to proceed. How do I best write a custom function that operates on the three vectors sent to it and returns an object consisting of a vector and scalars, e.g. one that returns X1+X2+X3 as well as scalars such as the means and standard deviations of X1, X2 and X3?

Suppose the following is the function:

someFunc <- function(subMat){
    out <- list() # collect results in a local list, not in the function object itself
    out$vector=subMat[,1]+subMat[,2]+subMat[,3] # return vector
    out$mean1=mean(subMat[,1])
    out$sd1=sd(subMat[,1])
    out$mean2=mean(subMat[,2])
    out$sd2=sd(subMat[,2])
    out$mean3=mean(subMat[,3])
    out$sd3=sd(subMat[,3])
    return(out)
}

I guess I'm trying to figure out how a custom function in R can be built to accept vectors and return a vector plus scalars, so consider the above function my rough attempt.

Next step. In one of the other posts @Prasad Chalasani noted that the proper R way of sending such a function the vectors would involve using the apply function, but I can't seem to put the pieces together.

result <- apply( cols, 2, someFunc ........

Let me know if any of this is vague and I will do my best to clarify the problem further. In summary: I have a data frame, from which I want to send every unique combination of three columns to a function that then returns multiple results. I am having trouble defining such a function and sending data to it using apply.

Upvotes: 0

Views: 118

Answers (1)

mnel

Reputation: 115392

You can pass a function to combn through its FUN argument, and this appears to be what you really want to do. In this case you don't want to simplify the results, so set simplify = FALSE.

A list would appear to be what you want to return, since it can hold a vector alongside scalars.
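As a minimal sketch of returning a list, here is a hypothetical helper, summarise3 (a name made up for illustration, not from the question), that takes a three-column data frame and returns the element-wise sum together with the per-column means and standard deviations. The data frame is rebuilt from the values shown in the question.

```r
# Hypothetical helper for illustration: takes a data frame with three
# numeric columns and returns one vector plus several scalars in a list.
summarise3 <- function(subMat) {
  list(
    vector = subMat[, 1] + subMat[, 2] + subMat[, 3],  # element-wise sum
    means  = sapply(subMat, mean),                     # one mean per column
    sds    = sapply(subMat, sd)                        # one sd per column
  )
}

# Same values as the data frame in the question
data <- data.frame(X1 = 1:4, X2 = 5:8, X3 = 9:12, X4 = 13:16, X5 = 17:20)

res <- summarise3(data[, c("X1", "X2", "X3")])
res$vector   # the summed column
res$means    # named numeric vector of column means
res$sds      # named numeric vector of column standard deviations
```

Each element is then available by name with $ or [[ ]], which is what makes a list convenient for mixing a vector with scalars.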

You can use setNames to set the names of the result in a single line, using combn with paste to give a reasonable name:

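# dd: the data frame; .which: character vector of column names to use
someFunc <- function(dd, .which){
  # row-wise sum of the chosen columns, stored in a one-element list
  rr <- list(vec = rowSums(dd[,.which]))
  # name that element after the columns, e.g. sum.X1.X2.X3
  names(rr) <- paste('sum', paste(.which,collapse = '.'),sep='.')
  # per-column means and standard deviations, as named lists
  means <- lapply(dd[.which],mean)
  sds <- lapply(dd[.which], sd)

  # c() prefixes the names, giving sd.X1, ..., mean.X1, ...
  return(c(rr, sd = sds, mean =means))
}
# combn hands each combination of three names to someFunc as .which
results <- setNames(combn(names(data),3, FUN = someFunc,dd = data, simplify = FALSE), 
                    combn(names(data),3, FUN = paste, collapse=':'))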
head(results,2)
# $`X1:X2:X3`
# $`X1:X2:X3`$sum.X1.X2.X3
#  1  2  3  4 
# 15 18 21 24 
# 
# $`X1:X2:X3`$sd.X1
# [1] 1.290994
# 
# $`X1:X2:X3`$sd.X2
# [1] 1.290994
# 
# $`X1:X2:X3`$sd.X3
# [1] 1.290994
# 
# $`X1:X2:X3`$mean.X1
# [1] 2.5
# 
# $`X1:X2:X3`$mean.X2
# [1] 6.5
# 
# $`X1:X2:X3`$mean.X3
# [1] 10.5
# 
# 
# $`X1:X2:X4`
# $`X1:X2:X4`$sum.X1.X2.X4
#  1  2  3  4 
# 19 22 25 28 
# 
# $`X1:X2:X4`$sd.X1
# [1] 1.290994
# 
# $`X1:X2:X4`$sd.X2
# [1] 1.290994
# 
# $`X1:X2:X4`$sd.X4
# [1] 1.290994
# 
# $`X1:X2:X4`$mean.X1
# [1] 2.5
# 
# $`X1:X2:X4`$mean.X2
# [1] 6.5
# 
# $`X1:X2:X4`$mean.X4
# [1] 14.5
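
Once results exists, individual pieces can be pulled out by name. The sketch below repeats the definitions above so that it runs on its own, then collects the sum vectors into a matrix; the name sums is only illustrative, and it assumes the sum is always the first element of each result, as it is in someFunc.

```r
# Same values as the data frame in the question
data <- data.frame(X1 = 1:4, X2 = 5:8, X3 = 9:12, X4 = 13:16, X5 = 17:20)

# Same function as above, repeated so this block is self-contained
someFunc <- function(dd, .which){
  rr <- list(vec = rowSums(dd[, .which]))
  names(rr) <- paste('sum', paste(.which, collapse = '.'), sep = '.')
  means <- lapply(dd[.which], mean)
  sds <- lapply(dd[.which], sd)
  c(rr, sd = sds, mean = means)
}

results <- setNames(combn(names(data), 3, FUN = someFunc, dd = data, simplify = FALSE),
                    combn(names(data), 3, FUN = paste, collapse = ':'))

# One scalar from one combination
results[["X1:X2:X3"]][["mean.X3"]]

# The sum vector is the first element of each result (an assumption tied
# to how someFunc builds its list), so sapply can bind them column-wise
sums <- sapply(results, function(r) r[[1]])
dim(sums)             # rows of data by number of combinations
sums[, "X1:X2:X4"]    # the summed column for one combination
```

Keeping the results in a named list and extracting only what is needed avoids having to force the vector and the scalars into one rectangular structure.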

Upvotes: 0
