Thalecress

Reputation: 3451

Remove quotes from a character vector in R?

I have some text:

version of mackinnon’s “dominance approach,”

which I've read into a character vector:

> my.char.vector
[1] "version" "of" "mackinnon’s" "“dominance" "approach,”" 

How can I remove double (and single) quotes, such that my.char.vector is

[1] "version" "of" "mackinnons" "dominance" "approach," 

The other question with this exact title is not, in fact, asking the same question - it's trying to print without quotes. Elements in my character vector really do contain quotes, which I'm trying to remove.

Upvotes: 0

Views: 2320

Answers (4)

Crops

Reputation: 5154

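Try this. Note that what counts as [:print:] depends on the locale, so in a UTF-8 locale the curly quotes may be treated as printable and left in place.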

gsub("[^[:print:]]", "", my.char.vector)

Upvotes: 1

akrun

Reputation: 887851

Another option with qdap

library(qdap)
strip(mcv, char.keep=',')
#[1] "version"    "of"         "mackinnons" "dominance"  "approach," 

Or using stringi

library(stringi)
stri_replace_all_regex(mcv, '[^[:alnum:],]+', '')
#[1] "version"    "of"         "mackinnons" "dominance"  "approach," 

Or base R

vapply(regmatches(mcv, gregexpr('[A-Za-z,]+', mcv)), paste,
       character(1L), collapse = "")
#[1] "version"    "of"         "mackinnons" "dominance"  "approach," 

data

mcv <- c("version","of","mackinnon’s","“dominance","approach,”")

Upvotes: 2

Rich Scriven

Reputation: 99371

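You can use stringi for this. The ICU property escape \\P{Ll} matches any character that is not a lowercase letter, and -- is set difference, which takes the comma back out of that set. So [\\P{Ll}--,] matches everything except lowercase letters and commas, and replacing those matches with "" keeps only the letters and the comma.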

library(stringi)
mcv <- c("version", "of", "mackinnon’s", "“dominance", "approach,”")
stri_replace_all_regex(mcv, "[\\P{Ll}--,]", "")
# [1] "version"    "of"         "mackinnons" "dominance"  "approach," 

I'm just learning ICU, but I think that's the right expression to use.
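
For comparison, here is a rough base R analogue of the same "keep lowercase letters and commas" idea. It is only a sketch: it swaps the ICU set difference for a negated PCRE class, and it assumes R's PCRE was built with Unicode property support (perl = TRUE). The \u escapes stand in for the curly quotes so the snippet does not depend on file encoding, and the names kept and upper_demo are just illustrative.

```r
# Sample vector from the question, with curly quotes written as \u escapes.
mcv <- c("version", "of", "mackinnon\u2019s", "\u201Cdominance", "approach,\u201D")

# [^\p{Ll},] matches any character that is neither a lowercase letter
# nor a comma; replacing those matches with "" keeps only the rest.
kept <- gsub("[^\\p{Ll},]", "", mcv, perl = TRUE)
print(kept)

# Caveat: uppercase letters (and digits) are not in \p{Ll}, so they are
# stripped as well.
upper_demo <- gsub("[^\\p{Ll},]", "", "MacKinnon\u2019s", perl = TRUE)
print(upper_demo)
```

As the second call shows, a keep-list like this also drops capitals and digits, so it only suits text that is already lowercase.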

Upvotes: 2

Ben Bolker

Reputation: 226801

These are "fancy" (curly) quotes. I cut & pasted them from the screen; if you just use SHIFT-' (or whatever it is on your keyboard) you'll only get regular " quotes.

mcv <- c("version","of","mackinnon’s","“dominance","approach,”")
gsub("[’”“]","",mcv)

Another possibility (it seems to work on my system, but might be system- or locale-specific): convert the non-ASCII characters to "#", or something else safe, and then get rid of them. Any "#" already present in the text would be removed too.

gsub("#","",iconv(mcv,"latin1","ASCII","#"))
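
If pasting the curly quotes into a script is fragile on your setup, one way to write the first approach is with \u escapes. This is a minimal sketch, not a definitive list: it assumes the only quote characters of interest are U+2018, U+2019, U+201C, U+201D and the straight ' and ", and the names quote_class and cleaned are made up for illustration.

```r
# Sample vector from the question, with curly quotes written as \u escapes
# so the script does not depend on the encoding of the source file.
mcv <- c("version", "of", "mackinnon\u2019s", "\u201Cdominance", "approach,\u201D")

# U+2018 / U+2019: curly single quotes; U+201C / U+201D: curly double quotes.
# The straight ' and " are included in the class as well.
quote_class <- "[\u2018\u2019\u201C\u201D'\"]"

# Remove every character in the class; everything else (including the
# comma) is left alone.
cleaned <- gsub(quote_class, "", mcv)
print(cleaned)
```

Unlike the iconv route, this removes only the listed quote characters, so accented letters and any literal "#" in the text survive.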

Upvotes: 1
