Reputation: 3451
I have some text:
version of mackinnon’s “dominance approach,”
which I've read into a character vector:
> my.char.vector
[1] "version" "of" "mackinnon’s" "“dominance" "approach,”"
How can I remove the double (and single) quotes, so that my.char.vector becomes:
[1] "version" "of" "mackinnons" "dominance" "approach,"
The other question with this exact title is not, in fact, asking the same question - it's trying to print without quotes. Elements in my character vector really do contain quotes, which I'm trying to remove.
Upvotes: 0
Views: 2320
Reputation: 887851
Another option, with qdap (mcv is the sample vector):
mcv <- c("version","of","mackinnon’s","“dominance","approach,”")
library(qdap)
strip(mcv, char.keep=',')
#[1] "version" "of" "mackinnons" "dominance" "approach,"
Or using stringi:
library(stringi)
stri_replace_all_regex(mcv, '[^[:alnum:],]+', '')
#[1] "version" "of" "mackinnons" "dominance" "approach,"
Or base R:
vapply(regmatches(mcv, gregexpr('[A-Za-z,]+', mcv)), paste,
       character(1L), collapse="")
#[1] "version" "of" "mackinnons" "dominance" "approach,"
Upvotes: 2
Reputation: 99371
You can use stringi for this. We can use the ICU metacharacter \\P to negate the matched property (\\P{Ll} is anything that is not a lowercase letter) and -- to subtract the comma from that set, so commas are kept.
library(stringi)
mcv <- c("version", "of", "mackinnon’s", "“dominance", "approach,”")
stri_replace_all_regex(mcv, "[\\P{Ll}--,]", "")
# [1] "version" "of" "mackinnons" "dominance" "approach,"
I'm just learning ICU, but I think that's the right expression to use.
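For comparison, here is a minimal base R sketch of the same idea (keep lowercase letters and commas, drop the rest). It assumes R's PCRE engine was built with Unicode property support, which is the usual case but not guaranteed; the variable names are only illustrative. It also shows a caveat of matching on \\p{Ll}: uppercase letters and digits get removed too.

```r
# Same sample vector, written with Unicode escapes so the source file stays ASCII.
mcv <- c("version", "of", "mackinnon\u2019s", "\u201Cdominance", "approach,\u201D")

# Drop everything that is not a lowercase letter or a comma.
lower_only <- gsub("[^\\p{Ll},]", "", mcv, perl = TRUE)
print(lower_only)

# Caveat: uppercase letters, digits and spaces are dropped as well.
mixed <- "MacKinnon\u2019s 2nd"
stripped_mixed <- gsub("[^\\p{Ll},]", "", mixed, perl = TRUE)
print(stripped_mixed)

# A wider class (any letter, any digit, comma) keeps them.
kept_mixed <- gsub("[^\\p{L}\\p{N},]", "", mixed, perl = TRUE)
print(kept_mixed)
```

If the real text can contain capitals or numbers, the wider class is probably the safer starting point.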
Upvotes: 2
Reputation: 226801
These are "fancy" (curly) quotes. I cut and pasted them from the screen; if you just type SHIFT-' (or whatever it is on your keyboard), you'll get regular " quotes.
mcv <- c("version","of","mackinnon’s","“dominance","approach,”")
gsub("[’”“]","",mcv)
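A variant of the character-class approach, as a sketch: writing the curly quotes as Unicode escapes avoids depending on how the script file itself is encoded, and adding the plain ASCII quotes covers the "double (and single)" case from the question. The names quote_chars and cleaned are just illustrative.

```r
# Sample vector with the curly quotes written as Unicode escapes.
mcv <- c("version", "of", "mackinnon\u2019s", "\u201Cdominance", "approach,\u201D")

# U+2018/U+2019 are the curly single quotes, U+201C/U+201D the curly double
# quotes; the plain ASCII " and ' are included as well.
quote_chars <- "[\u2018\u2019\u201C\u201D\"']"

cleaned <- gsub(quote_chars, "", mcv)
print(cleaned)
# [1] "version"    "of"         "mackinnons" "dominance"  "approach,"
```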
Another possibility (it seems to work on my system, but might be system- or locale-specific): convert the non-ASCII characters to "#", or something else safe, and then get rid of them.
gsub("#","",iconv(mcv,"latin1","ASCII","#"))
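A possible simplification, assuming the strings are actually UTF-8 (an assumption; check with Encoding() on your own data): pass sub = "" so iconv substitutes nothing for the bytes it cannot convert, which also avoids deleting any genuine "#" in the text. How iconv handles unconvertible input differs between platforms, so treat this as a sketch and inspect the result.

```r
# Sample vector with the curly quotes written as Unicode escapes (UTF-8 strings).
mcv <- c("version", "of", "mackinnon\u2019s", "\u201Cdominance", "approach,\u201D")

# Convert to ASCII, replacing any non-convertible bytes with the empty string.
ascii_only <- iconv(mcv, from = "UTF-8", to = "ASCII", sub = "")
print(ascii_only)

# Sanity check: TRUE if no non-ASCII characters are left.
all_ascii <- !any(grepl("[^\\x01-\\x7F]", ascii_only, perl = TRUE))
print(all_ascii)
```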
Upvotes: 1