Lee

Reputation: 3969

JavaScript can't find certain HTML elements

I'm putting together some offline technical documentation and have written some JavaScript for very basic syntax highlighting, and now for convenience I'm using it to replace < and > characters to save me time having to manually escape them.

The problem is that this works well for most HTML tags, but not for <html>, <head> and <body> blocks.

The HTML within the <code> blocks is present in the DOM, but JS doesn't seem to find it.

I understand the HTML in question is not valid, but given it is present when I view the page source, shouldn't it still be found?

function stringReplace(str, from, to) {
  // split/join replaces every occurrence; String.prototype.replace with a
  // string pattern would only replace the first one.
  return str ? str.split(from).join(to) : str;
}
var htmlChars = [
  ["<", "&lt;"],
  [">", "&gt;"]
];

function escapeHtmlChars(elementTagName, chars) {
  var codeSections = document.getElementsByTagName(elementTagName);
  for (var i = 0; i < codeSections.length; i++) {
    var codeContent = codeSections[i].innerHTML;
    for (var j = 0; j < chars.length; j++) {
      codeContent = stringReplace(codeContent, chars[j][0], chars[j][1]);
    }
    // Write back once, after all replacements have been applied.
    codeSections[i].innerHTML = codeContent;
  }
}

window.addEventListener("load", function() {
  // escapeHtmlChars returns nothing, so there is no value to log.
  escapeHtmlChars("code", htmlChars);
});
<code class="code-snippet"><!doctype html>
    <html>
        <head>
            <style type="text/css"></style>
        </head>
    
        <body>
    
        </body>
    </html>
    </code>

Upvotes: 0

Views: 450

Answers (2)

mplungjan

Reputation: 178421

Since these tags are stripped when the page is parsed, you should use AJAX to fetch the documents as plain text and convert them when you receive them.
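
A minimal sketch of the AJAX route, under some assumptions: each snippet lives in its own file, the pages are served over HTTP (many browsers block fetch for file:// URLs), and the names escapeHtml and showSource are made up for this illustration. The response is read as text, so it never goes through the HTML parser and the <html>, <head> and <body> tags are kept.

```javascript
// Escape the characters that have special meaning in HTML text content.
// "&" is handled first so the entities produced below are not escaped again.
function escapeHtml(source) {
  return source
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical helper: fetch the raw text at `url` and display it in `target`.
// The text is escaped before it is assigned to innerHTML, so the browser
// shows the tags instead of interpreting them.
async function showSource(url, target) {
  const response = await fetch(url);
  if (!response.ok) {
    throw new Error("Request failed with status " + response.status);
  }
  target.innerHTML = escapeHtml(await response.text());
}
```

In a page this could be called as showSource("snippets/page.html", document.querySelector("code")), where the path is only a placeholder.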

Alternatively, although <xmp> is obsolete, this still works in my browser:

var html = document.querySelector("xmp").textContent
console.log(html)
document.querySelector("code").innerHTML = html.replace(/<(\/)?(\w+)/g,"<br/>&lt;$1$2")
xmp { display: none }
code { white-space: pre; }
<xmp class="code-snippet">
<!doctype html>
<html>
  <head>
    <style type="text/css"></style>
  </head>
  <body>
  </body>
</html>
</xmp>
<code></code>

Upvotes: 1

Quentin

Reputation: 944556

I understand the HTML in question is not valid, but given it is present when I view the page source, shouldn't it still be found?

No, because your JavaScript isn't interacting with the source code.

The browser reads the source code. It constructs a DOM from it (which involves a lot of error recovery rules). You then read innerHTML, which generates HTML from the DOM.

The original data isn't available because the error recovery has already been applied.
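
You can see the round trip for yourself with a small browser-only sketch (roundTripThroughDom is a name made up for this illustration). It parses a string as the content of a <code> element and serializes the resulting DOM again, which is what happens to the markup in the question.

```javascript
// Browser-only sketch: parse `markup` as the content of a <code> element,
// then serialize the resulting DOM back to a string.
function roundTripThroughDom(markup) {
  var holder = document.createElement("code");
  holder.innerHTML = markup; // the parser applies its error recovery here
  return holder.innerHTML;   // generated from the DOM, not from the source
}
```

Running roundTripThroughDom("<html><head><style></style></head><body></body></html>") in the browser console should return a string without the <html>, <head> and <body> tags, because the parser ignores them inside an element like <code>.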

now for convenience I'm using it to replace < and > characters to save me time having to manually escape them

I suggest generating your HTML from Markdown files to save on the effort there. Alternatively, set up a Find & Replace in selection macro in your editor.

Upvotes: 2
