user240968

Reputation:

Removing the <script> elements from an HTML document

I'm using Ruby with the Nokogiri gem, and I want to get the content of the body without the script elements.

Nokogiri lets you search a parsed document with XPath or CSS3 selectors. I don't really understand XPath, and I can't find a CSS selector that does what I want.

Upvotes: 3

Views: 2874

Answers (1)

chlb

Reputation: 203

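I don't think this is possible with a single XPath selection, since selecting the body returns it with all of its descendants, script elements included.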

I'm not that familiar with Ruby or Nokogiri, but based on answers to a similar question, you might want to try selecting all script elements from the HTML document and removing them.

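require "nokogiri"

doc = Nokogiri::HTML(your_html)   # your_html is the HTML source as a string
doc.xpath("//script").remove      # detach every <script> element from doc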

Adjust accordingly.

Upvotes: 8
