Reputation: 1797
I am trying to compare the value of a parameter in each row of a dataframe with the value of the same parameter in every other row. The result is a matrix that is TRUE/FALSE at the intersection of each pair of rows. This is simple to implement with loops, but it takes too much processing time on a large dataframe. I can't think of a way to "vectorize" this code (use apply?) to speed it up. Many thanks in advance.
The code I have so far:
# Preallocate an n x n logical matrix
adjm <- matrix(FALSE, nrow = nrow(df), ncol = nrow(df))
# Compare varA in row i with varA in row t
for (i in seq_len(nrow(df))) {
  for (t in seq_len(nrow(df))) {
    adjm[t, i] <- df$varA[i] == df$varA[t]
  }
}
Upvotes: 0
Views: 108
Reputation: 18437
You can use outer to vectorize your code:
outer(df$varA, df$varA, "==")
For example:
df <- data.frame(varA = c(1, 2, 1, 3, 4, 2))
outer(df$varA, df$varA, "==")
##       [,1]  [,2]  [,3]  [,4]  [,5]  [,6]
## [1,]  TRUE FALSE  TRUE FALSE FALSE FALSE
## [2,] FALSE  TRUE FALSE FALSE FALSE  TRUE
## [3,]  TRUE FALSE  TRUE FALSE FALSE FALSE
## [4,] FALSE FALSE FALSE  TRUE FALSE FALSE
## [5,] FALSE FALSE FALSE FALSE  TRUE FALSE
## [6,] FALSE  TRUE FALSE FALSE FALSE  TRUE
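If you want to convince yourself that outer gives the same matrix as the nested loops, here is a minimal sketch of a check. The sample data and the names n, adjm_loop and adjm_outer are only illustrative; varA is the column from the question.

```r
# Illustrative sample data; replace with your own dataframe.
df <- data.frame(varA = c(1, 2, 1, 3, 4, 2))

# Loop-based version, preallocated as a logical matrix.
n <- nrow(df)
adjm_loop <- matrix(FALSE, nrow = n, ncol = n)
for (i in seq_len(n)) {
  for (t in seq_len(n)) {
    adjm_loop[t, i] <- df$varA[i] == df$varA[t]
  }
}

# Vectorized version: compares every element with every other element.
adjm_outer <- outer(df$varA, df$varA, "==")

# TRUE when the two matrices agree in every cell.
same_result <- all(adjm_loop == adjm_outer)
```

Keep in mind that either way the result is an n x n matrix, so memory use grows with the square of the number of rows.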
Upvotes: 4
Reputation: 44525
With apply:
apply(df, 1, function(x) x[1] == df$varA) # x[1] assumes varA is the first column; use its column index otherwise
But that's not technically vectorized, since apply still loops over the rows internally.
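A related variant, sketched here as an alternative rather than a fix, is to call sapply on the column itself. That avoids the conversion of the whole dataframe to a matrix that apply performs, which can matter if the other columns are not numeric. The sample data and the name adjm_sapply are only illustrative.

```r
# Illustrative sample data; replace with your own dataframe.
df <- data.frame(varA = c(1, 2, 1, 3, 4, 2))

# For each value of varA, compare it with the full column.
# sapply binds the resulting logical vectors together as columns of a matrix.
adjm_sapply <- sapply(df$varA, function(x) x == df$varA)
```

This is still a loop under the hood, so expect it to be slower than outer on large inputs.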
Upvotes: 1