Timothée HENRY

Reputation: 14614

R: Given a column, find duplicates of this column in a data.frame

Given the following data:

questionTagMatrix <- data.frame(
  question1 = c("0", "1", "0"),
  question2 = c("1", "0", "0"),
  question3 = c("0", "1", "0"),
  question4 = c("0", "1", "1")
)
rownames(questionTagMatrix) <- c("php", "html", "javascript")

newQuestion <- data.frame(newquestion = c("0", "1", "0"))
rownames(newQuestion) <- c("php", "html", "javascript")

How do I find all columns of questionTagMatrix equal to newQuestion?

Upvotes: 1

Views: 61

Answers (2)

agstudy

Reputation: 121608

A vectorized solution using colSums:

questionTagMatrix[, colSums(questionTagMatrix == newQuestion) ==
                    nrow(questionTagMatrix)]

          question1 question3
php                0         0
html               1         1
javascript         0         0

Note that newQuestion must be a character vector here, not a data.frame:

newQuestion <- c("0", "1", "0")  # not data.frame(newquestion = c("0", "1", "0"))
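
The colSums approach expects a plain vector. If newQuestion is still the one-column data.frame from the question, one possible way to bridge the two is to extract its column with `[[1]]` first. This is a minimal sketch, not part of the original answer; the names newVec, isMatch and matchingNames are introduced only for illustration, and stringsAsFactors = FALSE is an assumption made so the columns stay character on older R versions.

```r
# Data from the question, rebuilt so the example is self-contained.
# stringsAsFactors = FALSE is an assumption to keep columns as character.
questionTagMatrix <- data.frame(
  question1 = c("0", "1", "0"),
  question2 = c("1", "0", "0"),
  question3 = c("0", "1", "0"),
  question4 = c("0", "1", "1"),
  row.names = c("php", "html", "javascript"),
  stringsAsFactors = FALSE
)
newQuestion <- data.frame(
  newquestion = c("0", "1", "0"),
  row.names = c("php", "html", "javascript"),
  stringsAsFactors = FALSE
)

# Extract the single column as a plain character vector (hypothetical name).
newVec <- newQuestion[[1]]

# The vector is compared against each column in turn; a column matches
# when all of its rows are TRUE, i.e. its column sum equals nrow().
isMatch <- colSums(questionTagMatrix == newVec) == nrow(questionTagMatrix)
matchingNames <- names(questionTagMatrix)[isMatch]
print(matchingNames)
```

Note that this relies on newVec having exactly as many elements as questionTagMatrix has rows.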

To get only the question names:

names(questionTagMatrix)[colSums(questionTagMatrix == newQuestion) ==
                         nrow(questionTagMatrix)]
[1] "question1" "question3"

Upvotes: 0

Sven Hohenstein

Reputation: 81733

You can use apply to find the columns:

questionTagMatrix[apply(questionTagMatrix, 2,
                        function(x) all(x == as.matrix(newQuestion)))]

Each column of questionTagMatrix is compared with newQuestion, and only the columns in which every element matches are kept. The result:

#            question1 question3
# php                0         0
# html               1         1
# javascript         0         0
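
As a variant sketch (not part of the original answer), the same column-wise test can be written with vapply, which always returns a logical vector, combined with drop = FALSE so that the result stays a data.frame even if only one column matches. The names target, isMatch and result are hypothetical, and stringsAsFactors = FALSE is again an assumption.

```r
# Data from the question, rebuilt so the example is self-contained.
questionTagMatrix <- data.frame(
  question1 = c("0", "1", "0"),
  question2 = c("1", "0", "0"),
  question3 = c("0", "1", "0"),
  question4 = c("0", "1", "1"),
  row.names = c("php", "html", "javascript"),
  stringsAsFactors = FALSE
)
newQuestion <- data.frame(
  newquestion = c("0", "1", "0"),
  row.names = c("php", "html", "javascript"),
  stringsAsFactors = FALSE
)

# Column to look for, as a plain character vector (hypothetical name).
target <- newQuestion[[1]]

# Test each column of the data.frame; logical(1) enforces one TRUE/FALSE per column.
isMatch <- vapply(questionTagMatrix, function(x) all(x == target), logical(1))

# drop = FALSE keeps a data.frame even when a single column is selected.
result <- questionTagMatrix[, isMatch, drop = FALSE]
print(result)
```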

Upvotes: 2
