Reputation: 21
I am trying to remove script elements and their content from an HTML body, and this is what I have come up with so far:
just_text = just_text.replace(/<\s*script[^>]*>(<\s*\/script[^>]*>|$)/ig, '');
It does not work as I want it to; I still get the script content.
Can you please help me?
Thank you
Upvotes: 0
Views: 356
Reputation: 816384
The answer to such questions is always the same: Don't use regular expressions. Instead, parse the HTML, modify the DOM and serialize it back to HTML if you need to.
Example:
var container = document.createElement('div');
container.innerHTML = just_text;
// find and remove `script` elements; the collection is live,
// so iterate backwards to keep the remaining indices valid
var scripts = container.getElementsByTagName('script');
for (var i = scripts.length; i--; ) {
    scripts[i].parentNode.removeChild(scripts[i]);
}
just_text = container.innerHTML;
If you want to remove the script tags from the page itself, it's basically the same:
var scripts = document.body.getElementsByTagName('script');
for (var i = scripts.length; i--; ) {
    scripts[i].parentNode.removeChild(scripts[i]);
}
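As for why the pattern in the question leaves the content behind: after the opening tag it only allows a closing tag or the end of the string, so nothing in it can consume the script body. A small sketch to illustrate, using made-up sample strings (the names `pattern`, `empty` and `withContent` exist only for this demonstration):

```javascript
// The pattern from the question, unchanged.
var pattern = /<\s*script[^>]*>(<\s*\/script[^>]*>|$)/ig;

// Empty script element: the closing tag directly follows the opening tag,
// so the whole element matches and is removed.
var empty = '<p>a</p><script></script><p>b</p>'.replace(pattern, '');

// Script element with content: after `<script>` the pattern requires
// `</script>` or the end of the string, but finds `alert(1)`, so there is no match.
var withContent = '<p>a</p><script>alert(1)</script><p>b</p>'.replace(pattern, '');

console.log(empty);       // '<p>a</p><p>b</p>'
console.log(withContent); // '<p>a</p><script>alert(1)</script><p>b</p>'
```

Only the empty element is removed; as soon as there is anything between the tags, the pattern does not match at all, which is why the content is still there.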
Upvotes: 6