ZRoss

Reputation: 1479

Replace special characters (dash)

I was attempting to replace what I thought was a standard dash using gsub. The code I was testing was:

gsub("-", "ABC", "reported – estimate")

This does nothing, though. I copied and pasted the dash into http://unicodelookup.com/#–/1 and it seems to be an en dash. That site provides the hex, decimal, and other codes for an en dash, and I've been trying to use them to replace it, but without any luck. Suggestions?

(As a bonus, it would be helpful to know whether there is a function that identifies special characters.)

I'm not sure if SO's code formatting will change the dash format so here is the dash I'm using (–).

Upvotes: 6

Views: 5087

Answers (2)

Wiktor Stribiżew

Reputation: 626927

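The pattern in your code is an ordinary hyphen-minus (U+002D), which is a different character from the en dash (U+2013) in the string, so nothing matches. You can replace the en dash by putting that exact character in the regex pattern: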

gsub("–", "ABC", "reported – estimate")

You can match all hyphens, en- and em-dashes with

gsub("[-–—]", "ABC", "reported – estimate — more - text")

See IDEONE demo
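
If pasting the dash characters into your script is fragile (editor or file encoding issues), the same pattern can be written with Unicode escapes. This is just a sketch: the input string and the variable names are made up for illustration, and it assumes R is handling the strings as UTF-8.

```r
# Hypothetical input: "\u2013" is an en dash, "\u2014" is an em dash.
x <- "reported \u2013 estimate \u2014 more - text"

# The hyphen comes first in the bracket expression so it is taken
# literally rather than as a range operator.
out <- gsub("[-\u2013\u2014]", "ABC", x)
print(out)
```

R resolves the `\u` escapes when it parses the string literal, so the regex engine sees the actual dash characters, exactly as in the pasted version above.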

To check if there are non-ASCII characters in a string, remove every ASCII character and look at what is left:

> s <- "plus ça change, plus c'est la même chose"
> gsub("[[:ascii:]]+", "", s, perl = TRUE)
[1] "çê"

See this IDEONE demo

You will either get an empty string (if the input consists only of ASCII characters) or, as here, the non-ASCII "special" characters that remain.
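
If you also want to know which characters are non-ASCII and what their code points are, something along these lines may help. It is only a sketch: the sample string and the names `chars`, `codes`, `is_special`, and `found` are made up for illustration; `strsplit()` and `utf8ToInt()` are base R.

```r
# Hypothetical sample string; "\u2013" is the en dash from the question.
s <- "reported \u2013 estimate"

chars <- strsplit(s, "")[[1]]   # one element per character
codes <- utf8ToInt(s)           # integer Unicode code point per character

# Code points above 127 fall outside the ASCII range.
is_special <- codes > 127
found <- data.frame(
  char = chars[is_special],
  code = sprintf("U+%04X", codes[is_special])
)
print(found)
```

For this input the `code` column should read U+2013, which matches the en dash entry on the lookup site. `tools::showNonASCII()` is another option that ships with R if you only need to flag which elements of a character vector contain non-ASCII bytes.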

Upvotes: 6

Seekheart

Reputation: 1173

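For special character replacement you can use a negated character class.

gsub('[^\\w]+', 'ABC', 'reported - estimate', perl = TRUE) will replace each run of non-word characters with ABC. The [^\w] is a pattern that matches anything that isn't a word character (a letter, digit, or underscore), so note that it also matches spaces. Use + rather than * so the pattern cannot match the empty string, and write TRUE in capitals, since True is not defined in R.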
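
Here is a variant of that idea applied to the en dash from the question, as a rough sketch. Adding `\\s` to the negated class is my own assumption about what is wanted (leaving whitespace alone); the input string and variable names are illustrative only.

```r
# Hypothetical input containing an en dash ("\u2013").
x <- "reported \u2013 estimate"

# [^\w\s]+ matches runs of characters that are neither word characters
# nor whitespace, so the spaces around the dash are preserved.
out <- gsub("[^\\w\\s]+", "ABC", x, perl = TRUE)
print(out)
```

This should give "reported ABC estimate". Keep in mind that a negated class like this also catches ordinary punctuation such as commas and periods, so listing the dashes explicitly is safer when they are the only target.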

Upvotes: 3
