I have a data set that looks like this:
> data
             a        b         c         d
[1,] 0.5943590 2.195610 0.5332164 1.3004142
[2,] 0.7635876 1.917823 0.9714945 1.3251010
[3,] 0.9942722 2.350122 1.2048159 1.1675700
[4,] 0.3736785 1.876318 0.9109197 0.8520509
I want to apply a function to every pair of columns, for example:
F2 <- function(x, y) sum((x - y)^2)  # define function
F2(data[, "a"], data[, "b"])  # use function for first two columns
F2(data[, "a"], data[, "c"])  # use function for first and third columns
F2(data[, "b"], data[, "c"])  # use function for second and third columns
..................
How can I use the apply family to do this? Any help is greatly appreciated.
That's a job for combn:
# some data
set.seed(42)
m <- matrix(rnorm(16), 4)
F2 <- function(x, y) sum((x - y)^2)
res <- matrix(NA, ncol(m), ncol(m))
res[lower.tri(res)] <- combn(ncol(m), 2,
                             FUN = function(ind) F2(m[, ind[1]], m[, ind[2]]))
print(res)
#          [,1]     [,2]     [,3] [,4]
# [1,]       NA       NA       NA   NA
# [2,] 2.992875       NA       NA   NA
# [3,] 4.293073 8.320698       NA   NA
# [4,] 7.944818 6.484424 16.44946   NA
# for nicer printing
as.dist(res)
#           1         2         3
# 2  2.992875
# 3  4.293073  8.320698
# 4  7.944818  6.484424 16.449463
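If you want the results labelled by column name, as in the question, one option is to run combn over the column names and then use apply on the resulting pairs. This is only a minimal sketch: it assumes the question's data is a matrix with columns a to d (the values are copied from the question), and col_pairs and pair_vals are names made up here for illustration.

```r
# Data from the question, entered as a matrix with named columns
data <- matrix(c(0.5943590, 0.7635876, 0.9942722, 0.3736785,
                 2.195610,  1.917823,  2.350122,  1.876318,
                 0.5332164, 0.9714945, 1.2048159, 0.9109197,
                 1.3004142, 1.3251010, 1.1675700, 0.8520509),
               nrow = 4, dimnames = list(NULL, c("a", "b", "c", "d")))

F2 <- function(x, y) sum((x - y)^2)

# All unordered pairs of column names; each column of col_pairs is one pair
col_pairs <- combn(colnames(data), 2)

# Apply F2 to each pair, then label each result "a-b", "a-c", ...
pair_vals <- apply(col_pairs, 2, function(p) F2(data[, p[1]], data[, p[2]]))
names(pair_vals) <- apply(col_pairs, 2, paste, collapse = "-")
pair_vals
```

The result is a named numeric vector with one entry per pair, so a single value can be looked up with pair_vals[["a-c"]].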
And of course, for this specific function you are better off using dist, which is optimized for this kind of distance calculation. It computes Euclidean distances between rows, hence the transpose and the squaring:
dist(t(m))^2
#           1         2         3
# 2  2.992875
# 3  4.293073  8.320698
# 4  7.944818  6.484424 16.449463
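As a quick sanity check, the two approaches can be compared directly. The sketch below rebuilds the same m and compares the combn values with the squared dist values; via_combn and via_dist are names chosen here for illustration. It relies on dist storing its lower triangle column by column, which is the same pair order that combn produces.

```r
set.seed(42)
m <- matrix(rnorm(16), 4)
F2 <- function(x, y) sum((x - y)^2)

# Pairwise values via combn, in the order (1,2), (1,3), (1,4), (2,3), ...
via_combn <- combn(ncol(m), 2,
                   FUN = function(ind) F2(m[, ind[1]], m[, ind[2]]))

# dist() works on rows, so transpose first; square to undo the square root
via_dist <- as.vector(dist(t(m))^2)

# Compare up to floating-point tolerance
all.equal(via_combn, via_dist)
```

This should return TRUE, since both compute the sum of squared differences for the same column pairs.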