Find column index where row value is greater than zero in R

Question

I have a data set as follows:

    A   B   C
R1  1   0   1
R2  0   1   0
R3  0   0   0

I want to add another column to the data set, named Index, that gives, for each row, the names of the columns whose value is greater than zero. The result I want is as follows:

    A   B   C   Index
R1  1   0   1   A,C
R2  0   1   0   B
R3  0   0   0   NA

missuse · Accepted Answer

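Here is one approach using base R:

Use apply to go over the rows, find the elements that are equal to 1, and paste together the corresponding column names. Since the sample data contain only 0 and 1, testing for 1 is the same as testing for greater than zero here: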

df$Index <- apply(df, 1, function(x) paste(colnames(df)[which(x == 1)], collapse = ", "))

df$Index <- creates a new column called Index that holds the result of the operation

apply - applies a function over the rows and/or columns of a matrix or data frame

1 - specifies that the function should be applied to rows (2 means columns)

function(x) - an anonymous function, defined inline; x corresponds to one row at a time

x == 1 - tests each element of the row, giving TRUE/FALSE

which(x == 1) - the positions of the elements that are TRUE, i.e. equal to 1

colnames(df) - the names of the columns of the data frame

colnames(df)[which(x == 1)] - subsets the column names at those positions

paste with collapse = ", " - collapses a character vector (here, the column names selected in the previous step) into a single string, with the elements separated by ", ".
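The steps above can be traced on a single row. This is only an illustrative sketch: the vector x below is a hypothetical stand-in for what apply passes to the function, and the names cols, row_label and empty_label are introduced here, not taken from the answer.

```r
# Hypothetical single row, standing in for the `x` that apply() passes in
x <- c(A = 1, B = 0, C = 1)
cols <- names(x)

x == 1                # logical vector, one TRUE/FALSE per element
which(x == 1)         # integer positions of the TRUE elements
cols[which(x == 1)]   # column names at those positions

# Collapse the selected names into one string
row_label <- paste(cols[which(x == 1)], collapse = ", ")

# A row with no matches gives an empty string, not NA
all_zero <- c(A = 0, B = 0, C = 0)
empty_label <- paste(cols[which(all_zero == 1)], collapse = ", ")
```

The empty string produced for an all-zero row is why a separate step is needed to turn such entries into NA.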

Now replace the empty entries with NA:

df$Index[df$Index == ""] <- NA_character_

Here is what the output looks like:

#output
  sample A B C Index
1     R1 1 0 1  A, C
2     R2 0 1 0     B
3     R3 0 0 0  <NA>

data:

df <- structure(list(sample = structure(1:3, .Label = c("R1", "R2", "R3"), class = "factor"),
                     A = c(1L, 0L, 0L),
                     B = c(0L, 1L, 0L),
                     C = c(1L, 0L, 0L)),
                .Names = c("sample", "A", "B", "C"),
                class = "data.frame", row.names = c(NA, -3L))
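The question asks for values greater than zero, whereas the code above tests for values equal to 1. For data that is not strictly 0/1, a variant along the following lines may be closer to what was asked. It is a sketch under assumptions: the value columns are taken to be A, B and C as in the sample data, and value_cols is a name introduced here. It also restricts apply to the numeric columns, because apply converts a data frame to a matrix first, and a non-numeric column such as sample would turn every value into a string, so a comparison like > 0 would become a string comparison.

```r
# Sample data rebuilt with data.frame() so the block runs on its own
df <- data.frame(
  sample = c("R1", "R2", "R3"),
  A = c(1L, 0L, 0L),
  B = c(0L, 1L, 0L),
  C = c(1L, 0L, 0L)
)

# Assumed: these are the columns holding the values to test
value_cols <- c("A", "B", "C")

df$Index <- unname(apply(df[value_cols], 1, function(x) {
  hits <- value_cols[x > 0]          # names of columns with a positive value
  if (length(hits) == 0) {
    NA_character_                    # no positive value in this row
  } else {
    paste(hits, collapse = ",")      # e.g. "A,C", as in the question
  }
}))
```

Returning NA_character_ directly inside the function removes the need for a separate replacement step, at the cost of a slightly longer function body.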
