user2702330

Reputation: 321

Apply a function to every pair of columns in R

I have a data set that looks like this:

> data
             a       b        c         d
[1,] 0.5943590 2.195610 0.5332164 1.3004142
[2,] 0.7635876 1.917823 0.9714945 1.3251010
[3,] 0.9942722 2.350122 1.2048159 1.1675700
[4,] 0.3736785 1.876318 0.9109197 0.8520509

I want to apply a function to every pair of columns, for example:

F2 <- function(x, y) sum((x - y)^2) # define the function
# data prints as a matrix, so columns are selected by name with [, ] rather than $
F2(data[, "a"], data[, "b"]) # first and second columns
F2(data[, "a"], data[, "c"]) # first and third columns
F2(data[, "b"], data[, "c"]) # second and third columns
..................

How can I use the apply family to do this? Any help is greatly appreciated.

Upvotes: 3

Views: 438

Answers (1)

Roland

Reputation: 132676

That's a job for combn:

#some data
set.seed(42)
m <- matrix(rnorm(16),4)

F2<- function(x,y) (sum((x - y) ^ 2))

res <- matrix(NA, ncol(m), ncol(m))

res[lower.tri(res)] <- combn(ncol(m), 2, 
                             FUN=function(ind) F2(m[,ind[1]], m[,ind[2]]))

print(res)

#          [,1]     [,2]     [,3] [,4]
# [1,]       NA       NA       NA   NA
# [2,] 2.992875       NA       NA   NA
# [3,] 4.293073 8.320698       NA   NA
# [4,] 7.944818 6.484424 16.44946   NA

#for nicer printing
as.dist(res)

#           1         2         3
# 2  2.992875                    
# 3  4.293073  8.320698          
# 4  7.944818  6.484424 16.449463
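If the matrix has column names, as in the question, it can be handy to label each result with the pair it belongs to. The following is only a sketch: it assumes the question's values are stored in a matrix with columns a to d, and the names pairs and res_named are made up for illustration. It also uses apply, since the question asked about the apply family.

```r
# The question's data, assumed here to be a numeric matrix with named columns
data <- matrix(c(0.5943590, 0.7635876, 0.9942722, 0.3736785,
                 2.195610,  1.917823,  2.350122,  1.876318,
                 0.5332164, 0.9714945, 1.2048159, 0.9109197,
                 1.3004142, 1.3251010, 1.1675700, 0.8520509),
               nrow = 4, dimnames = list(NULL, c("a", "b", "c", "d")))

F2 <- function(x, y) sum((x - y)^2)

# All pairs of column names; each column of `pairs` holds one pair
pairs <- combn(colnames(data), 2)

# Apply F2 to each pair of columns, then label each result, e.g. "a-b"
res_named <- apply(pairs, 2, function(p) F2(data[, p[1]], data[, p[2]]))
names(res_named) <- apply(pairs, 2, paste, collapse = "-")

res_named
```

The result is a named numeric vector with one entry per pair, in the same order that combn generates the pairs.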

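And of course, for this specific function you are better off using dist, which is optimized for this kind of distance calculation. Since dist computes Euclidean distances between rows, transpose the matrix and square the result: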

dist(t(m))^2

#           1         2         3
# 2  2.992875                    
# 3  4.293073  8.320698          
# 4  7.944818  6.484424 16.449463
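To check that the two approaches agree, the results can be compared numerically. This is a small sketch that regenerates the same m; via_combn, via_dist and same are illustrative names, and all.equal is used because it tolerates tiny floating-point differences.

```r
set.seed(42)
m <- matrix(rnorm(16), 4)
F2 <- function(x, y) sum((x - y)^2)

# Pairwise results via combn, in the order (1,2), (1,3), (1,4), (2,3), ...
via_combn <- combn(ncol(m), 2,
                   FUN = function(ind) F2(m[, ind[1]], m[, ind[2]]))

# dist() compares rows, so transpose to compare columns; it returns
# Euclidean distances, so square them to match F2
via_dist <- as.vector(dist(t(m))^2)

# TRUE when both vectors are equal up to numerical tolerance
same <- isTRUE(all.equal(via_combn, via_dist))
same
```

The comparison works element by element because dist stores its lower triangle column by column, which is the same order in which combn enumerates the pairs.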

Upvotes: 7
