Lorenzo Benassi

Reputation: 621

Compare rows of a data frame with rows of a matrix in R

I have created a matrix like this:

> head(matrix)
     Var1 Var2 Var3 Var4 Var5 Var6 Var7 Var8 Var9 Var10 Var11
[1,] "0"  "0"  "1"  "0"  "1"  "1"  "0"  "0"  "0"  "0"   "NA"  
[2,] "1"  "0"  "1"  "0"  "1"  "1"  "0"  "0"  "0"  "0"   "NA"  
[3,] "0"  "1"  "1"  "0"  "1"  "1"  "0"  "0"  "0"  "0"   "NA"  
[4,] "1"  "1"  "1"  "0"  "1"  "1"  "0"  "0"  "0"  "0"   "NA"  
[5,] "0"  "0"  "2"  "0"  "1"  "1"  "0"  "0"  "0"  "0"   "NA"  
[6,] "1"  "0"  "2"  "0"  "1"  "1"  "0"  "0"  "0"  "0"   "NA"

Now, I want to compare the matrix above with the following data frame:

> head(df)
       cod Var11 Var1 Var2 Var3 Var4 Var5 Var6 Var7 Var8 Var9 Var10     Var12
1  C000354     B    1    1    4    0    1    2    0    0    0     1  51520.72
2  C000404     A    1    0    1    0    4    4    0    0    1     1  21183.25
3  C000444     A    1    0    4    1    3    3    0    0    0     1  67504.74
4  C000480     A    1    1    2    0    2    3    0    0    1     1  26545.92
5  C000983     C    1    0    1    0    3    4    0    0    0     0  10379.37
6  C000985     C    1    0    3    1    3    4    0    0    0     0  18660.99

The matrix contains all possible combinations of the variables Var1 to Var10. When a row of df (considering only columns Var1 to Var10) matches a row of the matrix, and that row of df has Var12 >= 90000, I would like "A" to be written in column Var11 of the corresponding matrix row.

I have tried with this:

for (i in 1 : nrow(matrix)) {
  for (j in 1 : 10) {
    ifelse(matrix[i,j]==df[,(j+2)]
           && df$Var12[] >= 90000,
           matrix[i,"Var11"] <- "A",
           matrix[i,"Var11"] <- "NA")
  }
}

But this writes "NA" in every row of the matrix.

Does anyone know why this happens or how to solve it?

Thanks in advance.

Upvotes: 0

Views: 384

Answers (1)

Jean

Reputation: 1490

I don't understand why you used 1:10 and j+2 in your loop.
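
As for why the original loop leaves "NA" everywhere, a likely explanation (my reading of the code, not checked against your data): matrix[i,j]==df[,(j+2)] compares one matrix cell with a whole column, and && works on single values only, so older versions of R use only the first element while R 4.3.0 and later raise an error. In addition, every pass of the inner loop overwrites Var11, so at best only the result for the last column would survive. The sketch below uses made-up toy data (m_row, d, row_hits and hit are hypothetical names) and compares a whole row at once with all() instead:

```r
# Toy data (hypothetical, not the data from the question):
# one matrix row stored as character, and a two-row data frame.
m_row <- c(Var1 = "1", Var2 = "0")
d <- data.frame(Var1 = c(0, 1), Var2 = c(0, 0), Var12 = c(50000, 95000))

# Comparing one cell with a whole column returns one value per row of d,
# which is why it cannot be fed to `&&` directly.
cell_vs_column <- m_row["Var1"] == d$Var1

# For each row of d: do ALL matched columns equal the matrix row?
row_hits <- apply(d[, c("Var1", "Var2")], 1, function(r) all(r == m_row))

# Combine element-wise with `&`, then collapse to a single TRUE/FALSE.
hit <- any(row_hits & d$Var12 >= 90000)
```

Here only the second row of d equals m_row and also has Var12 >= 90000, so hit is TRUE. The same idea, applied to every row of the matrix, is what the loop further down does.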

# Some dummy data
col_to_match <- paste0("V", 1:10)
set.seed(123)
# 10 x 10 matrix of random values plus an 11th column initialised to the string "NA"
mat <- cbind(matrix(sample(0:4, 100, replace=TRUE), ncol=10), "NA")
colnames(mat) <- c(col_to_match, "V11")
set.seed(123)
df <- data.frame("cod"=paste0("C",1:20), "V12"=runif(20, min=88000, max=95000))
set.seed(1)
# Rows 1-8 of df copy rows 3-10 of mat; the remaining 12 rows are random
df <- cbind(df, rbind(mat[3:10,col_to_match], matrix(sample(0:4, 120, replace=TRUE), ncol=10)))

Because rows 1 to 8 of df were copied from rows 3 to 10 of mat, the matrix rows expected to be flagged are c(3:10)[df[1:8,"V12"]>=90000]. Those are rows 3 4 5 6 7 9 10.

Run the following loop. For every row of the matrix, it checks whether any row of df matches it on all the compared columns and also has a V12 value of at least 90000.

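for(i in seq_len(nrow(mat))){
  # TRUE if some row j of df equals row i of mat on all matched columns and has V12 >= 90000
  hasMatch <- any(sapply(seq_len(nrow(df)), function(j) all( df[j,col_to_match] == mat[i, col_to_match] ) && df[j,"V12"]>=90000 ))
  if(hasMatch) mat[i, "V11"] <- "A"
}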

Output

 > mat
      V1  V2  V3  V4  V5  V6  V7  V8  V9  V10 V11 
 [1,] "1" "4" "4" "4" "0" "0" "3" "3" "1" "0" "NA"
 [2,] "3" "2" "3" "4" "2" "2" "0" "3" "3" "3" "NA"
 [3,] "2" "3" "3" "3" "2" "3" "1" "3" "2" "1" "A" 
 [4,] "4" "2" "4" "3" "1" "0" "1" "0" "3" "3" "A" 
 [5,] "4" "0" "3" "0" "0" "2" "4" "2" "0" "1" "A" 
 [6,] "0" "4" "3" "2" "0" "1" "2" "1" "2" "0" "A" 
 [7,] "2" "1" "2" "3" "1" "0" "4" "1" "4" "3" "A" 
 [8,] "4" "0" "2" "1" "2" "3" "4" "3" "4" "0" "NA"
 [9,] "2" "1" "1" "1" "1" "4" "3" "1" "4" "2" "A" 
[10,] "2" "4" "0" "1" "4" "1" "2" "0" "0" "2" "A" 
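
If the matrix really holds every combination of ten variables, the nested loop above may get slow. One possible alternative, offered only as a sketch, is to build one string key per row and look the keys up with %in%. The toy data and the names cols, mat2, df2, key_mat, qualifying and key_df are made up for illustration, and the approach assumes the values print identically as text in both objects (for example whole numbers on both sides):

```r
# Hypothetical toy data, not the data from the question.
cols <- c("V1", "V2", "V3")
mat2 <- cbind(matrix(c("0", "1", "0",
                       "0", "0", "1",
                       "1", "1", "1"), ncol = 3, byrow = TRUE), "NA")
colnames(mat2) <- c(cols, "V11")
df2 <- data.frame(V1 = c(0, 1, 1), V2 = c(0, 1, 1), V3 = c(1, 1, 1),
                  V12 = c(95000, 10000, 89999))

# One string key per matrix row; the separator keeps values from running together.
key_mat <- apply(mat2[, cols, drop = FALSE], 1, paste, collapse = "_")

# Keep only the df rows that pass the threshold, then build their keys the same way.
qualifying <- df2[df2$V12 >= 90000, cols, drop = FALSE]
key_df <- do.call(paste, c(qualifying, sep = "_"))

# Flag every matrix row whose key appears among the qualifying df rows.
mat2[key_mat %in% key_df, "V11"] <- "A"
```

In this toy example only the first row of df2 passes the threshold, and it equals the second row of mat2, so only that row gets "A". The lookup is done once for the whole matrix instead of once per pair of rows.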

Upvotes: 1
