Silhouettes

Reputation: 175

How to add new variable to existing data frame based on condition in multiple variables in R?

In R, I have a data set with several columns (id1 to id4) in which the words "true" and "false" occur at arbitrary positions. The columns id1 and id4 also contain some missing values (NAs).

id1 <- c('abc', 'false', 198,201)
id2 <- c(763,723,'true',323)
id3 <- c('true', 'def', 223,'hij')
id4 <- c(627,376,237,'false')

df1 <- data.frame(id1,id2,id3,id4)

I would like to add a variable "id5" to my data frame that indicates, for each row, whether that row contains "true" or "false". What is the best way to do this?

Desired result:

    id1   id2  id3   id4    id5
1   abc   763  true  627    true
2   false 723  def   376    false
3   198   true 223   237    true
4   201   323  hij   false  false

Upvotes: 1

Views: 52

Answers (2)

akrun

Reputation: 887951

In base R, we can use Reduce to combine the column-wise comparisons against "true" with an element-wise OR:

df1$id5 <- Reduce(`|`, lapply(df1, `==`, "true"))
df1$id5
#[1]  TRUE FALSE  TRUE FALSE
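
The question mentions NAs in id1 and id4. With `==`, an NA cell yields NA, and `NA | FALSE` is also NA, so a row could end up as NA instead of FALSE. Here is a minimal sketch of one way to guard against that, assuming NA cells should count as "not true". The two NAs in the sample data and the helper name `is_true` are added here for illustration and are not part of the original question.

```r
# Sample data from the question, with two NAs inserted for illustration (assumption).
df1 <- data.frame(
  id1 = c("abc", "false", NA, "201"),
  id2 = c("763", "723", "true", "323"),
  id3 = c("true", "def", "223", "hij"),
  id4 = c("627", NA, "237", "false"),
  stringsAsFactors = FALSE
)

# Compare each column with "true"; an NA cell is treated as FALSE.
is_true <- lapply(df1, function(col) !is.na(col) & col == "true")

# Combine the per-column logical vectors with an element-wise OR.
df1$id5 <- Reduce(`|`, is_true)
df1$id5
#[1]  TRUE FALSE  TRUE FALSE
```

Without the `!is.na(col)` guard, row 2 in this modified data would come out as NA rather than FALSE.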

Upvotes: 0

Ronak Shah

Reputation: 389325

Since one of the values ("true" or "false") is always present in each row, we can use rowSums.

df1$id5 <- rowSums(df1 == 'true', na.rm = TRUE) > 0
df1

#    id1  id2  id3   id4   id5
#1   abc  763 true   627  TRUE
#2 false  723  def   376 FALSE
#3   198 true  223   237  TRUE
#4   201  323  hij false FALSE

We can also use a row-wise apply:

apply(df1 == 'true', 1, any, na.rm = TRUE)
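
Both variants above return logical TRUE/FALSE, while the desired result in the question shows lowercase "true"/"false". If the strings are what is needed, the rowSums approach could be broken into steps roughly as follows. This is a sketch only; the intermediate names `hits` and `n_true` are made up for illustration.

```r
# Sample data from the question (every column ends up as character).
df1 <- data.frame(
  id1 = c("abc", "false", "198", "201"),
  id2 = c("763", "723", "true", "323"),
  id3 = c("true", "def", "223", "hij"),
  id4 = c("627", "376", "237", "false"),
  stringsAsFactors = FALSE
)

# Logical matrix: TRUE wherever a cell equals "true".
hits <- df1 == "true"

# Number of "true" cells per row; NA cells are skipped via na.rm = TRUE.
# unname() drops the row names that rowSums() carries over.
n_true <- unname(rowSums(hits, na.rm = TRUE))

# Store lowercase strings instead of logical values.
df1$id5 <- ifelse(n_true > 0, "true", "false")
df1$id5
#[1] "true"  "false" "true"  "false"
```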

Upvotes: 2
