Ernest Presley

Reputation: 171

Capture Column Names based on specific cell value

I have a dataset like the one below:

    A        B        C
    Yes      No       No
    No       Yes      No
    No       No       Yes
    Yes      No       Yes
    No       Yes      Yes

I would like to create a new column D (shown as Result below) that stores, for each row, the names of the columns whose cell value is "Yes". The desired output looks like this:

    A        B        C        Result
    Yes      No       No       A
    No       Yes      No       B
    No       No       Yes      C
    Yes      No       Yes      A,C
    No       Yes      Yes      B,C

This is what I have done so far, but it is very clumsy:

df$d1 <- ifelse(df[,1]=="Yes", paste(colnames(df[1])),"" )
df$d2 <- ifelse(df[,2]=="Yes", paste(colnames(df[2])),"" )
df$d3 <- ifelse(df[,3]=="Yes", paste(colnames(df[3])),"" )

I am looking for a more efficient way of doing this. Any help is much appreciated.

Upvotes: 2

Views: 149

Answers (3)

Sam

Reputation: 1482

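df <- data.frame(A = c("Yes","Yes","No","No","Yes"),
                 B = c("Yes","No","No","Yes","Yes"),
                 C = c("No","No","Yes","Yes","Yes"))

# One comma-separated string of column names per row
result <- character(nrow(df))

for (i in seq_len(nrow(df))) {
  # Positions of the columns whose value in row i is exactly "Yes"
  idx <- which(unlist(df[i, ]) == "Yes")
  result[i] <- paste(colnames(df)[idx], collapse = ",")
}

df$result <- result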

Upvotes: 1

Gopala

Reputation: 10473

Here is one approach that uses which and does not require creating an interim subsetted matrix:

df <- data.frame(A = c('Yes', 'No', 'No', 'Yes', 'No'),
                 B = c('No', 'Yes', 'No', 'No', 'Yes'),
                 C = c('No', 'No', 'Yes', 'Yes', 'Yes'),
                 stringsAsFactors = FALSE)
df$Result <- apply(df, 1, function(x) paste(names(which(x == 'Yes')), collapse = ','))

Resulting output is:

    A   B   C Result
1 Yes  No  No      A
2  No Yes  No      B
3  No  No Yes      C
4 Yes  No Yes    A,C
5  No Yes Yes    B,C
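
To make the one-liner above easier to follow, here is a small sketch of what the anonymous function does to a single row. The vector x below is a hypothetical stand-in for row 4 of df as apply would pass it (a named character vector); it is not part of the original answer.

```r
# Hypothetical stand-in for one row of df, as seen inside apply()
x <- c(A = "Yes", B = "No", C = "Yes")

# which() returns the positions of the TRUE elements and keeps their names
hits <- which(x == "Yes")

# The names of those positions are the column names we want
hit_names <- names(hits)

# Collapse them into a single comma-separated string
joined <- paste(hit_names, collapse = ",")
print(joined)
```

If a row contains no "Yes" at all, which returns an empty vector and paste with collapse yields an empty string for that row.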

Upvotes: 0

Gregor Thomas

Reputation: 145755

First, let's find where the "Yes" values are. This gives a logical matrix:

yes_mat = data == "Yes"

For each row, you want the column names of the data frame where the value is "Yes", that is, names(data)[x], where x is a row of yes_mat. Applying a function to the rows of a matrix is best done with apply. We then paste the matching names together, collapsing with a comma:

apply(yes_mat, 1, FUN = function(x) paste(names(data)[x], collapse = ","))
# [1] "A"   "B"   "C"   "A,C" "B,C"
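
Putting the two steps together, a minimal self-contained sketch might look like the following. It assumes the data frame is called data, as in the snippets above, and uses the example values from the question; storing the result in a Result column is an addition made here for illustration.

```r
# Example data from the question (the name `data` follows the snippets above)
data <- data.frame(A = c("Yes", "No", "No", "Yes", "No"),
                   B = c("No", "Yes", "No", "No", "Yes"),
                   C = c("No", "No", "Yes", "Yes", "Yes"),
                   stringsAsFactors = FALSE)

# Step 1: logical matrix, TRUE wherever a cell equals "Yes"
yes_mat <- data == "Yes"

# Step 2: for each row, keep the column names flagged TRUE and join them.
# The right-hand side is evaluated before Result is added, so names(data)
# still refers only to the columns A, B and C.
data$Result <- apply(yes_mat, 1, function(x) paste(names(data)[x], collapse = ","))

print(data)
```

Computing yes_mat before adding the new column keeps Result itself out of the comparison if the code is extended later.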

Upvotes: 4
