trevtrev84

Reputation: 31

Replace all values of a dataframe in R that contain a substring

I am trying to replace every value in a dataframe that contains the word "coin" with 0. A sample dataframe looks like this:

P1  P2       P3        P4
0   3 Coins  2         1
2   4        -2 Coins  4

My first attempt was to use lapply(dataframe, function) with a function that checks whether the value contains the string "coin" and, if so, returns 0.

I'm sure there are more efficient ways to do this, but it's the best I could come up with as a beginner in R.

I am struggling with the grepl() function, which should return TRUE if the string contains the substring I am looking for. However, I cannot figure out why the following code returns FALSE.

y = "-3 coins"
grepl(y,"coin",fixed=TRUE)

My question is: what am I doing wrong that makes this grepl call return FALSE when "coin" is in the initial string, and is there a better way to achieve my end goal of replacing all cells that contain "coin" with 0?

Any help is much appreciated, thank you!!

Upvotes: 3

Views: 3745

Answers (5)

TarJae

Reputation: 78917

So many solutions! Here is one more, using the str_contains function from the sjmisc package in combination with dplyr:

library(sjmisc)
library(dplyr)
df %>% 
  rowwise() %>% 
  mutate(across(everything(), ~ifelse(str_contains(., "Coins"), "0", .)))
#>      P1 P2    P3       P4
#>   <dbl> <chr> <chr> <dbl>
#> 1     0 0     2         1
#> 2     2 4     0         4

Upvotes: 2

Mwavu

Reputation: 2217

An easy way would be to convert df into a matrix, use grepl to find the matching cells, set them to 0, and then convert df back to a data.frame:

df <- as.matrix(df)
# ignore.case = TRUE so that both "coins" and "Coins" match
df[grepl(pattern = "coins", x = df, ignore.case = TRUE)] <- 0

df <- as.data.frame(df)

df

#>   p1 p2 p3 p4
#> 1  0  0 -2  1
#> 2  2  4  0  4

Upvotes: 1

akrun

Reputation: 886948

We may convert directly to numeric with as.numeric, which turns every element containing non-numeric characters into NA (with a coercion warning); those NA values can then be changed to 0.

df[] <- as.numeric(as.matrix(df))
df[is.na(df)] <- 0

-output

> df
  P1 P2 P3 P4
1  0  0  2  1
2  2  4  0  4

data

df <- structure(list(P1 = c("0", "2"), P2 = c("3 Coins", "4"), P3 = c("2", 
"-2 Coins"), P4 = c("1", "4")), class = "data.frame", row.names = c(NA, 
-2L))

Upvotes: 2

Claudiu Papasteri

Reputation: 2609

Here is a tidyverse solution. stringr::str_detect detects the string pattern and returns TRUE/FALSE, and this logical value is used inside if_else. To transform the values in every column, you can use either the superseded mutate_all function or the newer approach of mutate combined with across.

library(tidyverse)

df <- data.frame(
  P1 = c("0", "2"),
  P2 = c("3 Coins", "4"),
  P3 = c("2", "-2 Coins"),
  P4 = c("1", "4")
)

df 
#>   P1      P2       P3 P4
#> 1  0 3 Coins        2  1
#> 2  2       4 -2 Coins  4

df %>%
  mutate(
    dplyr::across(
      .cols = everything(), 
      .fns = ~ dplyr::if_else(stringr::str_detect(.x, "Coins"), "0", .x)
    )
  )
#>   P1 P2 P3 P4
#> 1  0  0  2  1
#> 2  2  4  0  4

Created on 2022-01-21 by the reprex package (v2.0.1)
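
For completeness, the superseded mutate_all variant mentioned above could look roughly like the sketch below. It assumes every column is character (stringsAsFactors = FALSE is set explicitly so this also holds on older R versions), and the name result is only there for illustration.

```r
library(dplyr)
library(stringr)

# Same sample data as above; all columns are character
df <- data.frame(
  P1 = c("0", "2"),
  P2 = c("3 Coins", "4"),
  P3 = c("2", "-2 Coins"),
  P4 = c("1", "4"),
  stringsAsFactors = FALSE
)

# mutate_all applies the same lambda to every column:
# cells containing "Coins" become "0", everything else is kept as is
result <- df %>%
  mutate_all(~ if_else(str_detect(.x, "Coins"), "0", .x))

result
```

This should give the same values as the across version; across is the form the dplyr documentation recommends for new code.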

Upvotes: 2

stefan

Reputation: 123893

Using lapply you could achieve your desired result like so:

df <- data.frame(
  P1 = c(0L, 2L),
  P2 = c("3 Coins", "4"),
  P3 = c("2", "-2 Coins"),
  P4 = c(1L, 4L)
)

df[] <- lapply(df, function(x) {
  # Lower-case each value first so that "Coins" and "coins" both match
  x[grepl("coin", tolower(x), fixed = TRUE)] <- 0
  x
})

df
#>   P1 P2 P3 P4
#> 1  0  0  2  1
#> 2  2  4  0  4
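
As for why the grepl call in the question returns FALSE: the likely cause is argument order. The signature is grepl(pattern, x, ...), so grepl(y, "coin", fixed = TRUE) searches for "-3 coins" inside "coin". A minimal sketch of the difference (the variable names swapped, correct and any_case are just for illustration):

```r
y <- "-3 coins"

# Arguments swapped, as in the question:
# this looks for the pattern "-3 coins" inside the string "coin"
swapped <- grepl(y, "coin", fixed = TRUE)
swapped
#> [1] FALSE

# Pattern first, then the string to search
correct <- grepl("coin", y, fixed = TRUE)
correct
#> [1] TRUE

# Matching is case-sensitive by default; ignore.case = TRUE also catches "Coins"
any_case <- grepl("coin", "3 Coins", ignore.case = TRUE)
any_case
#> [1] TRUE
```

Note that ignore.case is not used together with fixed = TRUE, which is why the lapply code above lower-cases the values with tolower instead.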

Upvotes: 5
