user7616021

Find column index where row value is greater than zero in R

I have a data set as follows:

    A   B   C
R1  1   0   1
R2  0   1   0
R3  0   0   0

I want to add another column to the data set, named Index, that gives, for each row, the names of the columns whose value is greater than zero. The result I want is as follows:

    A   B   C   Index
R1  1   0   1   A,C
R2  0   1   0   B
R3  0   0   0   NA

Upvotes: 1

Views: 4459

Answers (2)

missuse

Reputation: 19716

Here is one approach using base R:

Use apply to go over the rows, find the elements that are equal to one, and paste together the corresponding column names:

df$Index <- apply(df, 1, function(x) paste(colnames(df)[which(x == 1)], collapse = ", "))

df$Index <- - creates a new column called Index to hold the result of the operation

apply - applies a function over the rows and/or columns of a matrix/data frame

1 - specifies that the function is applied to rows (2 means columns)

function(x) - an anonymous function, defined right after it; x corresponds to each row

x == 1 - tests which elements of a row are equal to 1; the output is TRUE/FALSE

which(x == 1) - the positions of the elements for which x == 1 is TRUE

colnames(df) - the names of the columns of the data frame

colnames(df)[which(x == 1)] - subsets the column names at the positions returned by which(x == 1)

paste with collapse = ", " - collapses a character vector (here, the column names selected above) into a single string in which the elements are separated by ", "
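To make the steps above concrete, here is a small sketch that walks through them for a single row. The vector x is a hand-written stand-in for what apply would pass to the function for row R1; the names cols, idx, picked, result and empty are only illustrative.

```r
# One row of the example data as a named vector (stand-in for a row passed by apply)
x <- c(A = 1, B = 0, C = 1)
cols <- names(x)

x == 1                     # logical vector: TRUE FALSE TRUE
idx <- which(x == 1)       # integer positions of the TRUE values: 1 and 3
picked <- cols[idx]        # "A" "C"
result <- paste(picked, collapse = ", ")
result                     # "A, C"

# A row with no matches produces an empty string rather than NA
empty <- paste(cols[which(c(A = 0, B = 0, C = 0) == 1)], collapse = ", ")
empty                      # ""
```

The empty string in the last step is why a separate replacement with NA is needed afterwards.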

Now replace the empty entries with NA:

df$Index[df$Index == ""] <- NA_character_

Here is what the output looks like:

#output
  sample A B C Index
1     R1 1 0 1  A, C
2     R2 0 1 0     B
3     R3 0 0 0  <NA>
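The question asks for values greater than zero, while the code above tests for values equal to 1. A possible variant, assuming the value columns are exactly A, B and C (the vector value_cols below is an assumption, not part of the original answer), restricts apply to those numeric columns and compares with > 0:

```r
# Example data with the same layout as in this answer
df <- data.frame(
  sample = c("R1", "R2", "R3"),
  A = c(1L, 0L, 0L),
  B = c(0L, 1L, 0L),
  C = c(1L, 0L, 0L)
)

# Assumption: these are the columns that hold the values to test
value_cols <- c("A", "B", "C")

# Work on a numeric matrix so the comparison is numeric, not character
m <- as.matrix(df[value_cols])

# For each row, keep the names of the columns whose value is greater than zero
df$Index <- apply(m, 1, function(x) paste(value_cols[x > 0], collapse = ", "))

# Rows without any positive value end up as "", so turn those into NA
df$Index[df$Index == ""] <- NA_character_
df
```

Leaving the sample column out of the apply call also avoids converting the whole row to character before the comparison.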

data:

df <- structure(list(sample = structure(1:3, .Label = c("R1", "R2", 
"R3"), class = "factor"), A = c(1L, 0L, 0L), B = c(0L, 1L, 0L
), C = c(1L, 0L, 0L)), .Names = c("sample", "A", "B", "C"), class = "data.frame", row.names = c(NA, 
-3L))

Upvotes: 2

s_baldur

Reputation: 33508

A slightly different flavor of the apply() solution:

df$index <- apply(df, 1, function(x) ifelse(any(x > 0), toString(names(df)[x > 0]), NA))

   A B C index
R1 1 0 1  A, C
R2 0 1 0     B
R3 0 0 0  <NA>

data:

df <- structure(
  list(
    A = c(1L, 0L, 0L), 
    B = c(0L, 1L, 0L),
    C = c(1L, 0L, 0L)
  ), 
  row.names = paste0('R', 1:3), 
  class = "data.frame"
)
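For comparison, here is a sketch of an alternative that avoids the row-wise apply call. It is not part of the original answer; it uses which() with arr.ind = TRUE to locate the positive cells and tapply() to group the column names by row. The names hits, by_row and index are only illustrative.

```r
# Same example data as above
df <- data.frame(
  A = c(1L, 0L, 0L),
  B = c(0L, 1L, 0L),
  C = c(1L, 0L, 0L),
  row.names = paste0("R", 1:3)
)

# Row and column positions of every cell greater than zero
hits <- which(df > 0, arr.ind = TRUE)

# Collect the matching column names per row and join them with ", "
by_row <- tapply(names(df)[hits[, "col"]], hits[, "row"], toString)

# Start with NA everywhere, then fill in the rows that had at least one hit
index <- rep(NA_character_, nrow(df))
index[as.integer(names(by_row))] <- by_row
df$index <- index
df
```

Rows without any positive value never appear in hits, so they keep the initial NA and no separate replacement step is needed.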

Upvotes: 1
