Cam B

Reputation: 195

Remove HTML tags from string (R Programming)

Is there an easy way to remove HTML tags from a character string in R?

I'm currently extracting survey data from an XML document, and the question titles still contain HTML from the survey design, like this:

"Why did you give this performance question a low score?<br />"

Is there an easy way to remove the <br />?

Any help would be appreciated.

Upvotes: 4

Views: 5613

Answers (1)

Joshua Ulrich

Reputation: 176728

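Take a look at ?gsub and ?regex. Here's some simple code to remove the <br />. It won't work for all potential HTML tags: the pattern only matches tags that end in />, and because .* is greedy it would also delete any text between the first < and the last /> in the string.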

> string <- "Why did you give this performance question a low score?<br />"
> gsub("<.*/>","",string)
[1] "Why did you give this performance question a low score?"
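
If the titles can also contain opening and closing tags such as <b>...</b>, a somewhat more general pattern may help. The following is only a sketch: the helper name strip_html_tags is hypothetical, and the pattern assumes that no tag contains a literal ">" inside an attribute value. It removes each tag separately by matching "<", then one or more characters that are not ">", then ">".

```r
# Hypothetical helper, not part of base R: removes anything that looks like a tag.
# Assumption: no ">" appears inside an attribute value.
strip_html_tags <- function(x) {
  # "<[^>]+>" matches "<", one or more non-">" characters, then ">"
  gsub("<[^>]+>", "", x)
}

string <- "Why did you give this <b>performance</b> question a low score?<br />"
strip_html_tags(string)
# [1] "Why did you give this performance question a low score?"
```

Because [^>]+ cannot run past the end of a tag, text between two tags is kept. Entities such as &amp; are left as they are, and for arbitrary HTML a real parser is likely to be more reliable than a regular expression.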

Upvotes: 4
