OinkOink

Reputation: 63

R programming - How to remove special characters from a data set?

I have a data set of strings, and some of them contain special characters like the one shown below.

[Image: special character]

How do I remove special characters like the above from my data set?

Upvotes: 0

Views: 412

Answers (1)

snaut

Reputation: 2535

Use regular expressions to remove unwanted characters, for example:

dataset$textcolumn <- gsub("[^\\w\\s]", "", dataset$textcolumn, perl=TRUE)

to remove everything except word characters and whitespace. For more complex replacements, see the help topic ?regexp.
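As a minimal sketch of how that call behaves, here it is on a small made-up data frame. The sample strings are invented for illustration, and the extra whitespace cleanup step is an addition of mine, not something the original line does.

```r
# Hypothetical sample data; only the column name textcolumn comes from the line above.
dataset <- data.frame(
  textcolumn = c("Hello, world!", "R & Python: 100%"),
  stringsAsFactors = FALSE
)

# Drop every character that is neither a word character (\w) nor whitespace (\s).
dataset$textcolumn <- gsub("[^\\w\\s]", "", dataset$textcolumn, perl = TRUE)

# Optional extra step (an assumption, not in the original answer):
# collapse the runs of spaces left behind and trim the ends.
dataset$textcolumn <- trimws(gsub("\\s+", " ", dataset$textcolumn, perl = TRUE))

print(dataset$textcolumn)
```

Note that \w also keeps digits and the underscore, so those survive the cleanup.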

Also check the encoding (Encoding and iconv are helpful here); the text may be correct but read with the wrong encoding assumed.
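A rough sketch of the encoding check, assuming the text was actually stored as Latin-1 (that is only a guess for illustration; your data may use a different encoding). The string is built from raw bytes so the example does not depend on how this page itself is encoded.

```r
# Bytes of "café" in Latin-1: 0xE9 is "é" in that encoding (made-up example).
x <- rawToChar(as.raw(c(0x63, 0x61, 0x66, 0xe9)))

# R does not know the encoding yet.
print(Encoding(x))

# Declare what the bytes really are, then convert to UTF-8.
Encoding(x) <- "latin1"
y <- iconv(x, from = "latin1", to = "UTF-8")

# The single Latin-1 byte 0xE9 becomes the two UTF-8 bytes 0xC3 0xA9.
print(charToRaw(y))
```

If the characters look right after a conversion like this, nothing needs to be removed at all.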

Upvotes: 3
