Reputation: 195
Is there an easy way to remove HTML tags from a character string in R?
I'm currently extracting survey data from an XML document, and the question titles contain HTML left over from the survey design, like this:
"Why did you give this performance question a low score?<br />"
Is there an easy way to remove the <br />?
Any help would be appreciated.
Upvotes: 4
Views: 5613
Reputation: 176728
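Take a look at ?gsub and ?regex. Here's some simple code to remove the <br />, but it won't work for all potential HTML tags: the pattern only matches tags that end in />, and because .* is greedy it would also delete any text sitting between the first < and the last /> in the string.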
> string <- "Why did you give this performance question a low score?<br />"
> gsub("<.*/>","",string)
[1] "Why did you give this performance question a low score?"
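If the titles can contain other tags as well, a somewhat more general pattern may help. The following is only a sketch: strip_tags is a hypothetical helper name, not something from base R, and it assumes that no tag contains a literal > inside an attribute value. It removes anything of the form <...> and does not decode entities such as &amp;. For arbitrary HTML, a real parser is likely the safer choice.

```r
# Hypothetical helper (not part of base R): remove anything that looks like
# an HTML tag, i.e. "<", then one or more characters that are not ">", then ">".
# Assumption: tags never contain a literal ">" inside an attribute value.
strip_tags <- function(x) {
  gsub("<[^>]+>", "", x)
}

string <- "Why did you give <b>this</b> performance question a low score?<br />"
strip_tags(string)
# [1] "Why did you give this performance question a low score?"
```

Because [^>]+ cannot run past the end of a tag, text between two tags is left alone, and gsub is vectorised, so the same call works on a whole character vector of question titles.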
Upvotes: 4