Economist
Economist

Reputation: 183

Replace all non NAs across multiple columns with a specific string

Given the following example dataset:

df <- structure(list(Id = 1:10,
                     Department = c("A", "B", "A", "C",
                                    "A", "B", "B", "C", "D", "A"),
                     Q1 = c("US", NA, NA, "US",
                            NA, "US", NA, "US", NA, "US"),
                     Q2 = c("Comp B", NA, NA,
                            "Comp B", "Comp B", NA, "Comp B", NA, "Comp B", "Comp B"), 
                     Q3 = c(NA, NA, NA, NA, NA, NA, "Comp C", NA, NA, NA),
                     Q4 = c(NA, "Comp D", NA, "Comp D", NA, NA, NA, NA, "Comp D", NA),
                     Sales = c(10, 23, 12, 5, 5, 76, 236, 4, 3, 10)),
                row.names = c(NA, -10L), class = c("tbl_df", "tbl", "data.frame"))

Is there a way to replace all non NA values in columns Q2:Q4 with, for instance, the word "Competitor" all at once? I know how to do string_replace on individual columns but with over 100 columns, with different words to be replaced in each, I'm hoping there is a quicker way. I tried messing around with various versions of mutate(across(Q2:Q4, ~str_replace(.x, !is.na, "Competitor"))), which I modelled after mutate(across(Q2:Q4, ~replace_na(.x, 0))) but that didn't work. I'm still not clear on the syntax on across except for the most simple operations and don't even know if it is applicable here.

Thanks!

Upvotes: 2

Views: 772

Answers (2)

Anoushiravan R
Anoushiravan R

Reputation: 21908

Here is another option:

library(dplyr)
library(purrr)

df %>%
  mutate(pmap_df(select(df, Q2:Q4), ~ replace(c(...), !is.na(c(...)), "Competitor")))

# A tibble: 10 x 7
      Id Department Q1    Q2         Q3         Q4         Sales
   <int> <chr>      <chr> <chr>      <chr>      <chr>      <dbl>
 1     1 A          US    Competitor NA         NA            10
 2     2 B          NA    NA         NA         Competitor    23
 3     3 A          NA    NA         NA         NA            12
 4     4 C          US    Competitor NA         Competitor     5
 5     5 A          NA    Competitor NA         NA             5
 6     6 B          US    NA         NA         NA            76
 7     7 B          NA    Competitor Competitor NA           236
 8     8 C          US    NA         NA         NA             4
 9     9 D          NA    Competitor NA         Competitor     3
10    10 A          US    Competitor NA         NA            10

Upvotes: 2

akrun
akrun

Reputation: 887118

str_replace is for replacing substring. The second argument with is.na is not be called i.e is.na is a function. We could use replace to replace the entire non-NA element

library(dplyr)
df1 <- df %>%
     mutate(across(Q2:Q4, ~ replace(., !is.na(.), "Competitor")))

-output

# A tibble: 10 x 7
      Id Department Q1    Q2         Q3         Q4         Sales
   <int> <chr>      <chr> <chr>      <chr>      <chr>      <dbl>
 1     1 A          US    Competitor <NA>       <NA>          10
 2     2 B          <NA>  <NA>       <NA>       Competitor    23
 3     3 A          <NA>  <NA>       <NA>       <NA>          12
 4     4 C          US    Competitor <NA>       Competitor     5
 5     5 A          <NA>  Competitor <NA>       <NA>           5
 6     6 B          US    <NA>       <NA>       <NA>          76
 7     7 B          <NA>  Competitor Competitor <NA>         236
 8     8 C          US    <NA>       <NA>       <NA>           4
 9     9 D          <NA>  Competitor <NA>       Competitor     3
10    10 A          US    Competitor <NA>       <NA>          10

Or in base R

nm1 <- grep("^Q[2-4]$", names(df), value = TRUE)
df[nm1][!is.na(df[nm1])] <- "Competitor"

Upvotes: 3

Related Questions