user105833

Reputation: 39

Replacing a special character does not work with gsub

I have a table with many strings containing garbled characters that I'd like to replace with the original ones. ä became Ã¤ and ö became Ã¶, so I replace each Ã¶ with an ö in the text. That works; however, ß became Ã<U+009F>, and I am unable to replace it...

# Works just fine:
gsub('Ã¶', 'REPLACED', "Testing string Ã¶")


# this does not work
gsub("Ã<U+009F>", "REPLACED", "Testing string Ã<U+009F> ")

# this does not work either...
gsub("â<U+0080><U+0093>", "REPLACED", "Testing string â<U+0080><U+0093> ")

How do I tell R to replace these parts with a letter of my choice?

Upvotes: 0

Views: 676

Answers (2)

akrun

Reputation: 887118

The pattern contains a regex metacharacter (+, which means "one or more" of the preceding item). To match it literally, either escape it (as @boski shows in the other answer) or use fixed = TRUE:

sub("Ã<U+009F>", "REPLACED", "Testing string Ã<U+009F> ", fixed = TRUE)
#[1] "Testing string REPLACED "
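To see why the unescaped pattern fails, here is a minimal sketch. It assumes the strings contain the literal text <U+009F>, exactly as typed in the question. The helper replace_literal and the replacement values "ss" and "-" are hypothetical, made up for illustration only.

```r
# Unescaped, "U+" means "one or more U", so the literal "+" is never matched.
grepl("<U+009F>", "<U+009F>")                # FALSE
grepl("<U+009F>", "<UU009F>")                # TRUE
grepl("<U+009F>", "<U+009F>", fixed = TRUE)  # TRUE

# Hypothetical helper: apply several literal (non-regex) replacements in turn.
# `replacements` is a named character vector: names are the text to find,
# values are the text to insert.
replace_literal <- function(x, replacements) {
  for (pattern in names(replacements)) {
    x <- gsub(pattern, replacements[[pattern]], x, fixed = TRUE)
  }
  x
}

# Placeholder replacement values, chosen only for the example.
fixes <- c("Ã<U+009F>" = "ss", "â<U+0080><U+0093>" = "-")
result <- replace_literal("Testing string Ã<U+009F> and â<U+0080><U+0093>", fixes)
print(result)
#[1] "Testing string ss and -"
```

With fixed = TRUE no character in the pattern has a special meaning, so the same helper should also cope with patterns that contain ., ( or other metacharacters.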

Upvotes: 1

boski

Reputation: 2467

You have to escape the + symbol, because in a regex it is a quantifier ("one or more"), not a literal character.

> gsub("Ã<U\\+009F>", "REPLACED", "Testing string Ã<U+009F> ")
[1] "Testing string REPLACED "

> gsub("â<U\\+0080><U\\+0093>", "REPLACED", "Testing string â<U+0080><U+0093> ")
[1] "Testing string REPLACED "
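If the double backslash is hard to read, there are a couple of equivalent spellings. The following is a small sketch, assuming the same literal input as above; the variable names are only for illustration.

```r
x <- "Testing string Ã<U+009F> "

# Backslash escape, as shown above.
escaped <- gsub("Ã<U\\+009F>", "REPLACED", x)

# A one-character class also makes "+" literal, with no backslashes needed.
bracketed <- gsub("Ã<U[+]009F>", "REPLACED", x)

# With perl = TRUE, everything between \Q and \E is taken literally.
quoted <- gsub("\\QÃ<U+009F>\\E", "REPLACED", x, perl = TRUE)

# All three spellings give the same result.
print(identical(escaped, bracketed) && identical(bracketed, quoted))
#[1] TRUE
```

The character-class form is handy when only one or two symbols need escaping; \Q...\E is closer in spirit to fixed = TRUE but still lets you use regex syntax outside the quoted part.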

Upvotes: 0
