richpiana

Reputation: 421

Remove special apostrophe in R

I am doing some text mining and would like to remove the quote character " from my text (delete it). I tried to use gsub as follows, but it does not work:

text <- "\"branch"

removeSpecialChars <- function(x){
     result <- gsub('"',x)
     return(result)
}

without <- removeSpecialChars(text)

The desired output is branch, not "branch. Thanks for your help.

EDIT, to go further (I am trying to clean a text):

The input is a list containing many different strings. For example:

Input <- list(c("e","b", "stackoverflow", "\"branch"))

cleanCorpus <- function(corpus){
  corpus.tmp <- tm_map(corpus, removePunctuation,preserve_intra_word_dashes = TRUE)

  removeSpecialChars <- function(x){
    result <- gsub('"', "",x)
    return(result)
  }
  corpus.tmp <- removeSpecialChars(corpus.tmp)

  corpus.tmp <- tm_map(corpus.tmp, stripWhitespace)
  corpus.tmp <- tm_map(corpus.tmp, content_transformer(tolower))
  corpus.tmp <- tm_map(corpus.tmp, removeWords, stopwords("english"))
  return(corpus.tmp)
}
result <- cleanCorpus(Input)

Upvotes: 2

Views: 3848

Answers (2)

akrun

Reputation: 887691

We need to supply the replacement argument (here the empty string), which is missing from the gsub call in the question:

gsub('"', "", text)
#[1] "branch"

data

text <- "\"branch"
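As a minimal sketch of how the helper from the question could look with the replacement added: the argument order of gsub is pattern, replacement, x. The fixed = TRUE argument and the vector named words are additions for illustration, not part of the question; fixed = TRUE simply asks gsub to treat the pattern as a literal string rather than a regular expression.

```r
# Helper from the question, with the missing replacement argument added.
removeSpecialChars <- function(x) {
  # fixed = TRUE is an illustrative addition: match the literal " character
  gsub('"', "", x, fixed = TRUE)
}

text <- "\"branch"
without <- removeSpecialChars(text)
print(without)

# gsub is vectorized, so a character vector like the one in the
# question's edit can be cleaned in one call (words is a made-up name).
words <- c("e", "b", "stackoverflow", "\"branch")
cleaned <- removeSpecialChars(words)
print(cleaned)
```

Because gsub works element by element, no loop is needed for a plain character vector. A tm corpus is a different kind of object and would likely need the function wrapped before being passed to tm_map.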

Upvotes: 3

abhiieor

Reputation: 3554

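result <- gsub("\"", "", x) will work for you. When the pattern is written inside double quotes, you need to escape that " by using a backslash (\), and you also need to pass "" as the replacement.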
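A short sketch to check the escaping point, assuming the same text value as in the question. The variable names a and b are only for illustration. The backslash is just a way of writing the quote inside a double-quoted string; it is not stored in the string itself, so the single-quoted and escaped forms describe the same pattern.

```r
text <- "\"branch"

# Pattern in double quotes: the inner quote has to be escaped.
a <- gsub("\"", "", text)

# Pattern in single quotes: no escape is needed.
b <- gsub('"', "", text)

print(identical(a, b))  # TRUE: both calls give the same result
print(nchar("\""))      # 1: the string holds only the quote character
```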

Upvotes: 1
