Anubhav Dikshit

Reputation: 1829

Remove a character from an entire data frame

I have a data frame with various columns. Some of the values in some columns contain double quotes, and I want to remove them. For example:

ID    name   value1     value2
"1     x     a,"b,"c     x"
"2     y     d,"r"       z"

I want this to look like this:

ID    name   value1    value2
1     x      a,b,c      x
2     y      d,r        z

Upvotes: 14

Views: 33073

Answers (4)

Smerla

Reputation: 231

A dplyr solution (based on the suggestion of @akrun in one of the comments).

df1 <-  structure(list(ID = c("\"1", "\"2"), name = c("x", "y"),
                       value1 = c("a,\"b,\"c", "d,\"r\""),
                       value2 = c("x\"", "z\"")),
                      .Names = c("ID", "name", "value1", "value2"), class = "data.frame", row.names = c(NA, -2L))

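library(dplyr)  # provides %>%, mutate(), across() and everything()

# Remove every double quote from every column
df1 <- df1 %>% mutate(across(everything(), ~ stringr::str_remove_all(.x, '"')))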

Upvotes: 2

JohnBar

Reputation: 11

To remove a literal $ you have to escape it as \\$, because $ is a regular-expression metacharacter (it anchors the end of the string). Try:

df[] <- lapply(df, gsub, pattern="\\$", replacement="")
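As an alternative sketch, passing fixed = TRUE makes gsub treat the pattern as a literal string, so $ needs no escaping at all. The data frame `prices` below is made up for illustration and is not from the question.

```r
# Hypothetical example data (not from the question)
prices <- data.frame(item = c("a", "b"),
                     cost = c("$10", "$2.50"),
                     stringsAsFactors = FALSE)

# fixed = TRUE matches "$" literally instead of as a regex end-of-string anchor
prices[] <- lapply(prices, gsub, pattern = "$", replacement = "", fixed = TRUE)
prices
```

The same fixed = TRUE argument works for any other single character that happens to be special in regular expressions.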

Upvotes: 1

akrun

Reputation: 886938

I would use lapply to loop over the columns and then replace the " using gsub.

df1[] <- lapply(df1, gsub, pattern='"', replacement='')
df1
#  ID name value1 value2
#1  1    x  a,b,c      x
#2  2    y    d,r      z

If needed, the column classes can then be converted with type.convert:

df1[] <- lapply(df1, type.convert, as.is = TRUE)
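To see what the conversion does, here is a small self-contained sketch. It rebuilds the same example data with data.frame() for brevity and inspects the column classes before and after. The as.is = TRUE argument is my choice here so that text columns stay character instead of becoming factors.

```r
# Same example values as the question, built with data.frame() for brevity
df1 <- data.frame(ID = c("\"1", "\"2"), name = c("x", "y"),
                  value1 = c("a,\"b,\"c", "d,\"r\""),
                  value2 = c("x\"", "z\""),
                  stringsAsFactors = FALSE)

# Strip the double quotes from every column
df1[] <- lapply(df1, gsub, pattern = '"', replacement = '')
sapply(df1, class)   # every column is still character at this point

# as.is = TRUE keeps non-numeric columns as character rather than factor
df1[] <- lapply(df1, type.convert, as.is = TRUE)
sapply(df1, class)   # ID is converted to integer; the others stay character
```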

data

df1 <-  structure(list(ID = c("\"1", "\"2"), name = c("x", "y"),
value1 = c("a,\"b,\"c", 
"d,\"r\""), value2 = c("x\"", "z\"")), .Names = c("ID", "name", 
"value1", "value2"), class = "data.frame", row.names = c(NA, -2L))

Upvotes: 24

Tim Biegeleisen

Reputation: 520888

One option would be to use apply() along with the gsub() function to remove all double quotation marks:

df <- data.frame(ID=c("\"1", "\"2"),
                 name=c("x", "y"),
                 value1=c("a,\"b,\"c", "d,\"r\""),
                 value2=c("x\"", "z\""))

# Apply gsub() to each column (MARGIN = 2) and rebuild the data frame
df <- data.frame(apply(df, 2, function(x) {
                                  gsub("\"", "", x)
                              }))

> df
  ID name value1 value2
1  1    x  a,b,c      x
2  2    y    d,r      z
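One thing to keep in mind: gsub() always returns character data, so any numeric columns come back as character after this kind of blanket replacement. A minimal sketch that only touches the character columns is shown below. The data frame `mixed` and the helper vector `is_chr` are hypothetical names used for illustration.

```r
# Hypothetical data with one numeric and one character column
mixed <- data.frame(n = c(1.5, 2),
                    s = c("\"a", "b\""),
                    stringsAsFactors = FALSE)

# Logical vector marking which columns hold character data
is_chr <- vapply(mixed, is.character, logical(1))

# Remove double quotes only in those columns; numeric columns are left alone
mixed[is_chr] <- lapply(mixed[is_chr], gsub, pattern = "\"", replacement = "")
str(mixed)
```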

Upvotes: 2
