Flimm

Reputation: 151298

How do I get innerHTML of document before parsing?

I'm thinking of developing an HTML linter that runs client-side. The linter would need access to the document's HTML exactly as the server sent it.

Here's an example:

<!DOCTYPE html>
<p>One
<p>Two
<script>
console.log(document.documentElement.innerHTML); // this prints the parsed HTML, not the HTML as is
</script>

This prints to the console:

<head></head><body><p>One
</p><p>Two
<script>
console.log(document.documentElement.innerHTML); // this prints the parsed HTML, not the HTML as is
</script></p></body>

As you can see, this is not the output I wanted. I want the output to be exactly the same as the HTTP response body, byte for byte, including the doctype declaration and without added tags like <head></head>. Is this possible? Is it possible in a browser extension?

Upvotes: 0

Views: 92

Answers (2)

Igor Bykov

Reputation: 2822

I think the only way would be to re-fetch the document. If you want to fetch the HTML of the current page you could do something like this:

// location.href targets the current document; '.' would resolve to its directory
fetch(location.href)
  .then(res => res.text())
  .then(html => console.log(html));

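This gives you the HTML as sent by the server, untouched by the parser. Note that res.text() decodes the body to a string as UTF-8, and a second request is not guaranteed to return the same response as the original page load.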

If you want, you can then process the text further, for example with DOMParser.
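Since the question asks for the response byte for byte, here is a minimal sketch that reads the body with arrayBuffer() instead of text(), so no decoding happens until you ask for it. The names fetchRawBytes and decodeSource are hypothetical, the fetchImpl parameter exists only so the function can be exercised outside a browser, and the UTF-8 default is an assumption about the page's encoding.

```javascript
// Hypothetical helper: re-fetch a URL and return the response body as raw bytes.
// `fetchImpl` defaults to the global fetch; it is injectable for testing only.
async function fetchRawBytes(url, fetchImpl = fetch) {
  const res = await fetchImpl(url);
  if (!res.ok) {
    throw new Error(`Re-fetch failed with HTTP status ${res.status}`);
  }
  // arrayBuffer() exposes the body bytes without any character decoding.
  return new Uint8Array(await res.arrayBuffer());
}

// Hypothetical helper: decode the bytes into a string for text-based lint rules.
// Assumes UTF-8 unless a different encoding label is passed in.
function decodeSource(bytes, encoding = 'utf-8') {
  return new TextDecoder(encoding).decode(bytes);
}

// In a page, usage might look like:
// fetchRawBytes(location.href).then(bytes => console.log(decodeSource(bytes)));
```

Keeping the bytes and the decoded string separate lets a linter check encoding-level issues on the bytes and markup issues on the text. Keep in mind that TextDecoder strips a leading byte order mark by default, so the decoded string can be shorter than the byte array suggests.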

Upvotes: 2

Lionel Rowe

Reputation: 5956

You can do this easily enough by re-fetching the page's content, if you don't mind asynchronicity and an additional HTTP request:

;(async () => {
    // location.href targets the current document; '.' would resolve to its directory
    const res = await fetch(location.href)

    const text = await res.text()

    console.log(text)
})()
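If the extra HTTP request is a concern, one option is to ask fetch to prefer the browser's HTTP cache. The sketch below assumes the page was cacheable in the first place; refetchSource is a hypothetical name, and fetchImpl is injectable only so the function can be run outside a browser. Whether the returned copy matches what the parser originally received still depends on the server and its caching headers.

```javascript
// Hypothetical helper: re-fetch a page's source, preferring the HTTP cache.
// `fetchImpl` defaults to the global fetch; it is injectable for testing only.
async function refetchSource(url, fetchImpl = fetch) {
  const res = await fetchImpl(url, {
    // 'force-cache' returns a cached response (fresh or stale) if one exists,
    // and falls back to a normal network request otherwise.
    cache: 'force-cache',
    // Send cookies only to the same origin, which is also fetch's default.
    credentials: 'same-origin',
  });
  if (!res.ok) {
    throw new Error(`Re-fetch failed with HTTP status ${res.status}`);
  }
  return res.text();
}

// In a page, usage might look like:
// refetchSource(location.href).then(html => console.log(html));
```

For pages generated per request, or pages reached through a form submission, the re-fetched source may still differ from the document the browser actually parsed.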

Upvotes: 1
