Sebastian Zeki

Reputation: 6874

How to remove ASCII characters from a string in R

Perhaps I don't understand the nuances of ASCII, but I am failing to remove the encoded escape sequences from a string.

The input string is:

mystring <- "complications:  noneco-morbidity:nil \\x0c\\\\xd6\\p__"

My desired output is:

"complications:  noneco-morbidity:nil __"

My attempt is:

iconv(mystring, "latin1", "ASCII", sub = "")

but nothing is removed.

Upvotes: 1

Views: 217

Answers (2)

Theo

Reputation: 575

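The following is not a clean solution, but it might still be useful. Each unwanted sequence is listed together with its leading backslashes, so that the p alternative does not also strip the p from "complications":

gsub("\\\\+(x0c|xd6|p)", "", mystring)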
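The alternatives above are listed by hand. A more general sketch is shown below, assuming the unwanted sequences are always one or more literal backslashes followed by either x plus two hex digits or the letter p. That assumption is drawn only from the single sample string, and the helper name strip_literal_escapes is hypothetical:

```r
# Hypothetical helper: removes literal escape text such as \x0c, \\xd6 or \p.
# Assumes the junk always starts with one or more backslashes.
strip_literal_escapes <- function(x) {
  # \\\\+ in R source is the regex \\+ : one or more literal backslashes,
  # followed by x and two hex digits, or by the letter p.
  gsub("\\\\+(x[0-9A-Fa-f]{2}|p)", "", x)
}

mystring <- "complications:  noneco-morbidity:nil \\x0c\\\\xd6\\p__"
strip_literal_escapes(mystring)
# [1] "complications:  noneco-morbidity:nil __"
```

Because the backslashes are part of the match, an ordinary p or x elsewhere in the text is left alone.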

Upvotes: 0

R. Schifini

Reputation: 9313

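Use the following pattern as a regular expression with gsub (perl = TRUE is needed so that \x00 and \x7F are read as hex escapes inside the brackets):

"[^\\x00-\\x7F]+"

The leading ^ negates the class, so this expression matches any run of non-ASCII characters, and gsub removes it (replacement = ""). Note that the string in the question contains only ASCII characters: \\x0c in R source is a literal backslash followed by x0c, not the byte 0x0C. This pattern therefore leaves that string unchanged, and those literal sequences have to be matched explicitly, backslashes included.

Example:

gsub(pattern = "[^\\x00-\\x7F]+", replacement = "", "complications:  noneco-morbidity:nil \\x0c\\\\xd6\\p__", perl = TRUE)

[1] "complications:  noneco-morbidity:nil \\x0c\\\\xd6\\p__"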
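A quick way to tell whether a string holds genuine non-ASCII characters or only literal escape text is to inspect it directly. The sketch below assumes a UTF-8 session. The variable real_bytes is a made-up comparison string, not taken from the question:

```r
mystring <- "complications:  noneco-morbidity:nil \\x0c\\\\xd6\\p__"

# cat() prints the raw characters: the "escapes" are plain backslashes and letters.
cat(mystring, "\n")

# No character falls outside the ASCII range, so iconv() has nothing to drop.
has_non_ascii <- grepl("[^\\x00-\\x7F]", mystring, perl = TRUE)    # FALSE

# Hypothetical comparison string with a genuine non-ASCII character (U+00D6).
real_bytes <- "nil \u00d6__"
real_has_non_ascii <- grepl("[^\\x00-\\x7F]", real_bytes, perl = TRUE)  # TRUE

# Here iconv() does remove something, because U+00D6 cannot be represented in ASCII.
cleaned <- iconv(real_bytes, "UTF-8", "ASCII", sub = "")            # "nil __"
```

If the first check returns FALSE, as it does for the question's string, converting the encoding cannot help and the literal sequences need a regular expression that matches the backslashes themselves.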

Upvotes: 1
