ShahtajK

Reputation: 127

How to detect JavaScript in HTML in Node.js and stop the JS from running

Is there a way to detect whether an HTML file contains JavaScript? And can we stop the JavaScript in that HTML from running, from Node.js?

I know we can stop the HTML from rendering altogether by changing the response Content-Type from text/html to text/plain. But I'm trying to find a way to stop only the JS from running.

Kindly let me know if it's even possible. Thanks.

Upvotes: 0

Views: 172

Answers (1)

T.J. Crowder

Reputation: 1075309

I'm guessing you're sending the file to a browser from Node.js (you talked about changing the content type header).

To do this, you'll need to:

  • Parse the file with an HTML parser (there are a few available for Node.js). Be sure it's one that normalizes input, so that, for instance, <a href="&#106;&#97;&#118;&#97;&#115;&#99;&#114;&#105;&#112;&#116;&#58;&#99;&#111;&#100;&#101;&#72;&#101;&#114;&#101;&#40;&#41;">xxx</a> is normalized to <a href="javascript:codeHere()">xxx</a>. (Thanks Quentin for emphasizing that!)

  • Using the resulting document model, remove:

    • any script elements

    • any onxyz attributes (onclick, onmouseover) on elements

      For instance, <div onclick="..." should be changed to <div ....

    • any use of the javascript: scheme in URL attributes (like href on a elements)

      For instance, <a href="javascript:codeHere()" should be changed to <a href="#" or similar (if you remove href entirely, that works too, but the link will no longer automatically be a tab stop, etc.).

      (This is where normalization in the parser is important.)

  • Serialize the resulting document model to HTML and send it to the browser
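
The middle step above (detecting and removing script from the document model) can be sketched roughly as follows. This is only an illustration under stated assumptions: the node shape ({ tag, attrs, children }, with text nodes as plain strings) is hypothetical and stands in for whatever tree your chosen parser produces, the attribute values are assumed to be entity-decoded already, and the URL_ATTRS list is an assumption that is probably not exhaustive. Parsing and serializing are left to the parser.

```javascript
// Hypothetical node shape (NOT any particular parser's API):
//   element: { tag: "a", attrs: { href: "..." }, children: [ ... ] }
//   text:    a plain string
// Attribute values are assumed to be entity-decoded by the parser already.

// Assumed, likely incomplete, list of attributes that hold URLs.
const URL_ATTRS = new Set(["href", "src", "action", "formaction", "xlink:href"]);

// True if a URL value would be treated as a javascript: URL.
// Browsers drop tabs/newlines and leading control characters/spaces
// before reading the scheme, and the scheme is case-insensitive.
function usesJavaScriptScheme(value) {
  const cleaned = String(value)
    .replace(/[\t\n\r]/g, "")
    .replace(/^[\u0000-\u0020]+/, "")
    .toLowerCase();
  return cleaned.startsWith("javascript:");
}

// True if an attribute can carry script: an on* handler or a javascript: URL.
// Treating every "on..." attribute as a handler is deliberately conservative.
function isScriptAttr(name, value) {
  const lower = name.toLowerCase();
  if (lower.startsWith("on")) return true;
  return URL_ATTRS.has(lower) && usesJavaScriptScheme(value);
}

// Detection: does this tree contain any script?
function containsScript(node) {
  if (typeof node === "string") return false; // text node
  if (node.tag.toLowerCase() === "script") return true;
  for (const [name, value] of Object.entries(node.attrs || {})) {
    if (isScriptAttr(name, value)) return true;
  }
  return (node.children || []).some(containsScript);
}

// Removal: returns a cleaned copy of the tree, or null for a script element.
function stripScript(node) {
  if (typeof node === "string") return node; // text node, keep as-is
  if (node.tag.toLowerCase() === "script") return null;

  const attrs = {};
  for (const [name, value] of Object.entries(node.attrs || {})) {
    const lower = name.toLowerCase();
    if (lower.startsWith("on")) continue; // drop event handler attributes
    // Replace javascript: URLs with "#" so links remain tab stops.
    attrs[name] = isScriptAttr(name, value) ? "#" : value;
  }

  const children = (node.children || [])
    .map(stripScript)
    .filter((child) => child !== null);

  return { tag: node.tag, attrs, children };
}
```

With a real parser you would adapt the property names to its tree format, call something like containsScript on the parsed document to decide whether it carries script, and serialize the result of stripScript before sending it to the browser.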

Upvotes: 3
