Omry Atia

Reputation: 2443

Create a logical column for whether a row contains a string in any column

I have the following data frame:

df <- structure(list(x = c("cc", "aa", "BB", "dd"), y = c("ee", "dd",
"ff", "gg"), z = c("AA", "gg", "bb", "dd")), row.names = c(NA,
-4L), class = c("tbl_df", "tbl", "data.frame"))

I would like to create a binary column indicating whether each row contains "aa" (case insensitive) in any column. So in this case the first two values will be TRUE, and the last two will be FALSE. How can I do this using dplyr? All the answers I found explain how to filter those rows, rather than how to flag them in a new column.

Upvotes: 0

Views: 276

Answers (3)

akrun

Reputation: 887118

We could use a vectorized option with if_any (the leading + converts the logical result to integer 0/1):

library(dplyr)
library(stringr)
df %>%
     mutate(xyz = +(if_any(everything(), 
         ~ str_detect(., regex('aa', ignore_case = TRUE)))))

-output

# A tibble: 4 x 4
  x     y     z       xyz
  <chr> <chr> <chr> <int>
1 cc    ee    AA        1
2 aa    dd    gg        1
3 BB    ff    bb        0
4 dd    gg    dd        0
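
If a logical column is wanted rather than 0/1, one option is to drop the leading +. This is only a minimal sketch, assuming dplyr >= 1.0.4 (for if_any); the column name has_aa is illustrative and not from the question.

```r
library(dplyr)
library(stringr)

# Same data as in the question
df <- tibble(
  x = c("cc", "aa", "BB", "dd"),
  y = c("ee", "dd", "ff", "gg"),
  z = c("AA", "gg", "bb", "dd")
)

# Without the unary +, if_any() leaves the new column as logical
res <- df %>%
  mutate(has_aa = if_any(everything(),
                         ~ str_detect(.x, regex("aa", ignore_case = TRUE))))

res$has_aa
```

Keeping the column logical means it can be passed straight to filter() later, and it can still be converted with as.integer() if 0/1 is needed.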

Upvotes: 2

Anoushiravan R

Reputation: 21918

We can also use this:

library(dplyr)

df %>%
  rowwise() %>%
  # c_across() is used instead of cur_data(), which is deprecated as of dplyr 1.1.0
  mutate(xyz = +any(grepl("aa", c_across(everything()), ignore.case = TRUE)))

# A tibble: 4 x 4
# Rowwise: 
  x     y     z       xyz
  <chr> <chr> <chr> <int>
1 cc    ee    AA        1
2 aa    dd    gg        1
3 BB    ff    bb        0
4 dd    gg    dd        0

And also in base R we can do this:

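# Sum the per-column matches for each row; "> 0" keeps the result 0/1
# even when several columns match in the same row
df$xyz <- +(Reduce(`+`, apply(df, 1, \(x) +(grepl("aa", x, ignore.case = TRUE))) |>
         t() |>
         as.data.frame()) > 0)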

# A tibble: 4 x 4
  x     y     z       xyz
  <chr> <chr> <chr> <int>
1 cc    ee    AA        1
2 aa    dd    gg        1
3 BB    ff    bb        0
4 dd    gg    dd        0
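
Summing per-column matches produces a count, which exceeds 1 when several columns match in the same row. As a hedged alternative sketch in base R (not the approach used above), the per-column results can be combined with an element-wise OR, which gives a strict TRUE/FALSE flag; the intermediate name hits is illustrative.

```r
# Same data as in the question, as a plain data.frame
df <- data.frame(
  x = c("cc", "aa", "BB", "dd"),
  y = c("ee", "dd", "ff", "gg"),
  z = c("AA", "gg", "bb", "dd")
)

# One logical vector per column, TRUE where the cell contains "aa"
hits <- lapply(df, grepl, pattern = "aa", ignore.case = TRUE)

# Element-wise OR across the columns: TRUE if any column matched in that row
df$xyz <- Reduce(`|`, hits)

df$xyz
```

Wrapping the result in +( ) converts it to 0/1 if an integer column is preferred.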

Upvotes: 3

det

Reputation: 5232

library(tidyverse)

df %>%
  mutate(flag = pmap_lgl(., ~"aa" %in% str_to_lower(c(...))))

or with rowwise:

df %>%
  rowwise() %>%
  mutate(flag = "aa" %in% str_to_lower(c_across(everything())))

with data.table:

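library(data.table)
setDT(df)[, flag := data.table::transpose(.SD) %>% map_lgl(~"aa" %in% str_to_lower(.x))]

(transpose is qualified with data.table:: because purrr, loaded by tidyverse, also exports a function named transpose)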
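
Note that %in% compares whole cell values, so these snippets flag a row only when some cell equals "aa" (ignoring case), not when "aa" occurs inside a longer string. Here is a small sketch of the difference in base R; the vector cells is made up for illustration and is not from the question.

```r
# Hypothetical cell values: an exact match, a partial match, and no match
cells <- c("AA", "xaay", "bb")

# Whole-value comparison after lowercasing, as %in% does
exact <- tolower(cells) %in% "aa"

# Substring search, as grepl() or str_detect() does
partial <- grepl("aa", cells, ignore.case = TRUE)

exact    # TRUE FALSE FALSE
partial  # TRUE TRUE FALSE
```

For the data in the question both approaches give the same flags, so the choice only matters if cells can contain longer strings.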

Upvotes: 3
