user1885116

Reputation: 1797

Vectorize comparison of values in dataframe

I am trying to compare the value of a parameter in each row of a dataframe with the value of the same parameter in all other rows. The result is a matrix that is TRUE/FALSE at the intersection of each row with every other row. It is pretty simple to implement this with loops, but that takes too much processing time on a large dataframe. I am blanking on a way to "vectorize" this code (use apply?) and speed up the processing. Many thanks in advance.

The code that I use so far:

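# allocate the n x n result matrix
adjm <- matrix(0, nrow = nrow(df), ncol = nrow(df))

# compare varA of row i with varA of row t, for every pair of rows
for (i in seq_len(nrow(df))) {
  for (t in seq_len(nrow(df))) {
    adjm[t, i] <- df$varA[i] == df$varA[t]
  }
}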

Upvotes: 0

Views: 108

Answers (2)

dickoa

Reputation: 18437

You can use outer to vectorize your code:

outer(df$varA, df$varA, "==")

For example:

df <- data.frame(varA = c(1, 2, 1, 3, 4, 2))

outer(df$varA, df$varA, "==")
##       [,1]  [,2]  [,3]  [,4]  [,5]  [,6]
## [1,]  TRUE FALSE  TRUE FALSE FALSE FALSE
## [2,] FALSE  TRUE FALSE FALSE FALSE  TRUE
## [3,]  TRUE FALSE  TRUE FALSE FALSE FALSE
## [4,] FALSE FALSE FALSE  TRUE FALSE FALSE
## [5,] FALSE FALSE FALSE FALSE  TRUE FALSE
## [6,] FALSE  TRUE FALSE FALSE FALSE  TRUE
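
As a quick sanity check, the sketch below compares the loop from the question with the outer call on the same toy data. The names loop_result and outer_result are made up for illustration, and the loop matrix is initialised with FALSE here (an assumption, the question uses 0) so that both results are logical.

```r
# Toy data from the example above; varA is the column name used in the question.
df <- data.frame(varA = c(1, 2, 1, 3, 4, 2))
n <- nrow(df)

# Loop-based version, as in the question (initialised with FALSE instead of 0).
loop_result <- matrix(FALSE, nrow = n, ncol = n)
for (i in seq_len(n)) {
  for (t in seq_len(n)) {
    loop_result[t, i] <- df$varA[i] == df$varA[t]
  }
}

# Vectorized version: one call compares every pair of values.
outer_result <- outer(df$varA, df$varA, "==")

# Both matrices should agree element by element.
all(loop_result == outer_result)
```

Keep in mind that either approach builds an n x n matrix, so memory use grows quadratically with the number of rows.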

Upvotes: 4

Thomas

Reputation: 44525

With apply:

apply(df,1,function(x) x[1] == df$varA) # `1` should be column number for `varA`

But that's not technically vectorized, since apply still loops over the rows at the R level.
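
A possible variation, sketched here under the assumption that only the varA column matters, is to iterate over the column itself with vapply rather than over the rows of the data frame. apply first converts the data frame to a matrix, which would turn the values into character strings if the frame also had non-numeric columns; working on df$varA directly sidesteps that. The name res is hypothetical.

```r
# Toy data; varA is the column name used in the question.
df <- data.frame(varA = c(1, 2, 1, 3, 4, 2))

# For each value v of varA, compare it with the whole column.
# FUN.VALUE tells vapply that every call returns a logical vector of length nrow(df).
res <- vapply(df$varA, function(v) v == df$varA, logical(nrow(df)))

# Each column of res holds the comparisons for one row of df.
dim(res)
```

This is still an R-level loop, so for large inputs it will likely be slower than outer.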

Upvotes: 1
