Samantha J T Star

Reputation: 32788

How can I get the body contents out of a variable containing HTML?

I have a variable htmlSource containing HTML code like this:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> 
<html xmlns="http://www.w3.org/1999/xhtml"> 
<head> 
<title>IIS 8.0 Detailed Error - 404.0 - Not Found</title> 


</head> 
<body>xxx some code here yy</body> 
</html>

How can I create a new variable htmlBodyOnly that contains only "xxx some code here yy"? If possible, I would like to do this with a regular expression. I am just not sure how to exclude the start and end using a regex or something similar.

Sorry, but I don't have jQuery available to help. I am working only with a JavaScript variable, not with the DOM.

Upvotes: 0

Views: 1236

Answers (3)

0xcaff

Reputation: 13681

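You can use a DOMParser to parse the HTML and extract the content of the body. See this SO question: Converting HTML string into DOM elements?

// Parse the HTML string into a document (DOMParser is available in browsers)
var parser = new DOMParser();
var doc = parser.parseFromString(htmlSource, "text/html");
// The body's inner markup is then available as a string
console.log(doc.body.innerHTML);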

Here is a Fiddle!

Upvotes: 1

Shmoopy

Reputation: 649

This is ugly, but you can keep it as a string with this method:

var htmlBodyOnly = htmlSource.substring(htmlSource.indexOf("<body>") + 6, htmlSource.indexOf("</body>"));

The +6 is because the string "<body>" has 6 characters and the indexOf method returns the index of the first character in the string to search for.
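As a minimal sketch of the same offset arithmetic, the version below uses the tag's length instead of the literal 6 and checks for a missing tag, since indexOf returns -1 when nothing is found. The function name extractBodyByIndex and the sample string are made up for illustration and are not part of the original answer.

```javascript
// Hypothetical helper (not from the answer above): same indexOf/substring
// idea, but with the offset derived from the tag itself and a -1 guard.
function extractBodyByIndex(html) {
  var openTag = "<body>";
  var closeTag = "</body>";
  var start = html.indexOf(openTag);
  // Search for the closing tag only after the opening tag.
  var end = html.indexOf(closeTag, start + openTag.length);
  if (start === -1 || end === -1) {
    return null; // no plain <body>...</body> pair found
  }
  // Skip past "<body>" (openTag.length === 6) and stop before "</body>".
  return html.substring(start + openTag.length, end);
}

// Illustrative input modeled on the question's example.
var sampleHtml = "<html><head><title>t</title></head><body>xxx some code here yy</body></html>";
var bodyByIndex = extractBodyByIndex(sampleHtml);
console.log(bodyByIndex);
```

Returning null for a missing tag is just one possible choice; an empty string or a thrown error would work as well, depending on how the caller wants to handle bad input.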

Here's proof that it works given your example: http://jsfiddle.net/9wBkf/

This assumes that the body tag has no attributes; it will not work for something like <body class="myClass">.
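Since the question asked for a regular expression, here is a rough sketch of one that also tolerates attributes on the body tag. The function name extractBodyByRegex and the sample string are hypothetical. The pattern assumes that no attribute value contains a ">" character and that the string holds a single body element; a regex is not a full HTML parser, so treat this as a workaround for simple, known input.

```javascript
// Hypothetical helper (not from the answer above): capture everything
// between the opening and closing body tags with a regular expression.
function extractBodyByRegex(html) {
  // <body\b[^>]*>  opening tag, with or without attributes
  // ([\s\S]*?)     body contents, including newlines (non-greedy)
  // <\/body\s*>    closing tag
  // The i flag makes the match case-insensitive (<BODY> also matches).
  var match = /<body\b[^>]*>([\s\S]*?)<\/body\s*>/i.exec(html);
  return match ? match[1] : null;
}

// Illustrative input with an attribute on the body tag.
var withAttributes = '<html><body class="myClass">xxx some code here yy</body></html>';
var bodyByRegex = extractBodyByRegex(withAttributes);
console.log(bodyByRegex);
```

The [\s\S] class is used instead of "." because "." does not match newlines by default, and body content usually spans several lines.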

Upvotes: 2

dejakob

Reputation: 2092

I do not know which regular expression you could use for this, but I think I know an alternative solution. You can also convert your variable to a DOM object and then read its body child.

Converting HTML string into DOM elements?

Upvotes: 0
