Reputation: 936
I have a matrix of n variables and I want to make a new matrix containing the pairwise difference of every pair of columns, excluding the difference of a column with itself. Here is an example of the data.
Transportation.services Recreational.goods.and.vehicles Recreation.services Other.services
2.958003 -0.25983789 5.526694 2.8912009
2.857370 -0.03425164 5.312857 2.9698044
2.352275 0.30536569 4.596742 2.9190123
2.093233 0.65920773 4.192716 3.2567390
1.991406 0.92246531 3.963058 3.6298314
2.065791 1.06120930 3.692287 3.4422340
I tried the for loop below, but I'm aware that loops in R can be very slow.
Difference.Matrix<- function(data){
  n<-2
  new.cols="New Columns"
  list = list()
  for (i in 1:ncol(data)){
    for (j in n:ncol(data)){
      name <- paste("diff",i,j,data[,i],data[,j],sep=".")
      new<- data[,i]-data[,j]
      list[[new.cols]]<-c(name)
      data<-merge(data,new)
    }
    n= n+1
  }
  results<-list(data=data)
  return(results)
}
As I said, the code runs very slowly and has not even finished a single run yet. I apologize for the beginner-level code. I am also aware that this code leaves the original data in the matrix, but I can delete it later.
Is it possible for me to use an apply function or foreach on this data?
Upvotes: 1
Views: 4865
Reputation: 42689
You can find the pairs with combn and use apply to create the result:
apply(combn(ncol(d), 2), 2, function(x) d[,x[1]] - d[,x[2]])
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 3.217841 -2.568691 0.0668021 -5.786532 -3.151039 2.6354931
## [2,] 2.891622 -2.455487 -0.1124344 -5.347109 -3.004056 2.3430526
## [3,] 2.046909 -2.244467 -0.5667373 -4.291376 -2.613647 1.6777297
## [4,] 1.434025 -2.099483 -1.1635060 -3.533508 -2.597531 0.9359770
## [5,] 1.068941 -1.971652 -1.6384254 -3.040593 -2.707366 0.3332266
## [6,] 1.004582 -1.626496 -1.3764430 -2.631078 -2.381025 0.2500530
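For anyone who wants to reproduce this, here is a minimal self-contained sketch. It assumes the question's sample data is stored in a data frame called d (the name used above); the names pairs and diffs are only illustrative.

```r
# Sample data from the question, rebuilt as a data frame named d
d <- data.frame(
  Transportation.services = c(2.958003, 2.857370, 2.352275, 2.093233, 1.991406, 2.065791),
  Recreational.goods.and.vehicles = c(-0.25983789, -0.03425164, 0.30536569, 0.65920773, 0.92246531, 1.06120930),
  Recreation.services = c(5.526694, 5.312857, 4.596742, 4.192716, 3.963058, 3.692287),
  Other.services = c(2.8912009, 2.9698044, 2.9190123, 3.2567390, 3.6298314, 3.4422340)
)

# Each column of pairs holds the column indices (i, j) of one pair, with i < j,
# so no column is ever paired with itself
pairs <- combn(ncol(d), 2)

# Build one difference column per pair
diffs <- apply(pairs, 2, function(p) d[, p[1]] - d[, p[2]])

dim(diffs)  # 6 rows and choose(4, 2) = 6 columns
```

Because combn only generates pairs with i < j, each difference appears once and the self-differences the question wants to avoid never show up.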
You can add appropriate names with another apply. Here the column names are very long, which impairs the formatting, but the labels show which difference each column holds:
x <- apply(combn(ncol(d), 2), 2, function(x) d[,x[1]] - d[,x[2]])
colnames(x) <- apply(combn(ncol(d), 2), 2, function(x) paste(names(d)[x], collapse=' - '))
> x
Transportation.services - Recreational.goods.and.vehicles Transportation.services - Recreation.services
[1,] 3.217841 -2.568691
[2,] 2.891622 -2.455487
[3,] 2.046909 -2.244467
[4,] 1.434025 -2.099483
[5,] 1.068941 -1.971652
[6,] 1.004582 -1.626496
Transportation.services - Other.services Recreational.goods.and.vehicles - Recreation.services
[1,] 0.0668021 -5.786532
[2,] -0.1124344 -5.347109
[3,] -0.5667373 -4.291376
[4,] -1.1635060 -3.533508
[5,] -1.6384254 -3.040593
[6,] -1.3764430 -2.631078
Recreational.goods.and.vehicles - Other.services Recreation.services - Other.services
[1,] -3.151039 2.6354931
[2,] -3.004056 2.3430526
[3,] -2.613647 1.6777297
[4,] -2.597531 0.9359770
[5,] -2.707366 0.3332266
[6,] -2.381025 0.2500530
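If the apply calls themselves become a bottleneck on wide data, one possible variation is to index the columns with the two rows of the combn result and subtract everything in one step. This is only a sketch on a small made-up matrix m (not the question's data); for a data frame such as d you would presumably convert with as.matrix first.

```r
# Toy matrix (hypothetical data, not from the question)
m <- cbind(a = c(1, 2, 3), b = c(10, 20, 30), c = c(100, 200, 300))

# 2 x 3 matrix: row 1 holds the first index of each pair, row 2 the second
idx <- combn(ncol(m), 2)

# Subtract all pairs at once by selecting columns with the two index rows
diffs <- m[, idx[1, ]] - m[, idx[2, ]]

# Label each column with the pair it came from
colnames(diffs) <- paste(colnames(m)[idx[1, ]], colnames(m)[idx[2, ]], sep = " - ")
diffs
```

The columns come out in the same order as the apply version (a - b, a - c, b - c), so the two approaches should be interchangeable for a numeric matrix.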
Upvotes: 6