Reputation: 3867
I have complete web pages containing JavaScript, CSS and HTML. I need to remove the script and style tags, plus their contents, completely. I have achieved this in PHP using the following regexes:
$str = preg_replace('#<script(.*?)>(.*?)</script>#is', '', $html);
$str = preg_replace('#<style(.*?)>(.*?)</style>#is', '', $str);
But I can't get it working in JavaScript. I want the equivalent of
<script(.*?)>(.*?)</script> //in javascript
I want to replace all occurrences within the HTML. I have already stripped out the other HTML tags with this:
pureText.replace(/<(?:.|\n)*?>/gm, ''); //just a reference
Upvotes: 0
Views: 4168
Reputation: 38183
Don't use regex for this. It is much slower and less reliable than manipulating the DOM.
var scripts = document.getElementsByTagName('script');
var css = document.getElementsByTagName('style');
// getElementsByTagName returns a live collection, so iterate backwards
// to avoid skipping elements as they are removed.
for(var i = scripts.length - 1; i >= 0; i--)
{
scripts[i].parentNode.removeChild(scripts[i]);
}
for(var j = css.length - 1; j >= 0; j--)
{
css[j].parentNode.removeChild(css[j]);
}
Upvotes: 4
Reputation: 174706
I want to have the equivalent of
<script(.*?)>(.*?)</script> //in javascript
/<script([\S\s]*?)>([\S\s]*?)<\/script>/ig
Use [\S\s]*? instead of .*? in your regex, because JavaScript did not support the s modifier (DOTALL modifier) before ES2018. [\S\s]*? matches any character, whitespace or non-whitespace (newlines included), zero or more times, non-greedily.
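As a rough sketch of how this pattern could be applied to an HTML string, here is a small helper that strips both script and style blocks. The function name stripScriptsAndStyles and the sample markup are made up for illustration, and the added \b and \s* are small assumptions to avoid matching tag names that merely start with "script" or "style" and to tolerate whitespace in the closing tag.

```javascript
// Hypothetical helper; the name is not from the question or the answers.
function stripScriptsAndStyles(html) {
  // [\s\S]*? matches any character, including newlines, non-greedily,
  // which stands in for the DOTALL behaviour of PHP's "s" modifier.
  return html
    .replace(/<script\b[\s\S]*?>[\s\S]*?<\/script\s*>/gi, '')
    .replace(/<style\b[\s\S]*?>[\s\S]*?<\/style\s*>/gi, '');
}

// Made-up sample input with multi-line script and style blocks.
const sample =
  '<p>Hi</p>\n' +
  '<script type="text/javascript">\nalert(1);\n</script>\n' +
  '<style>\np { color: red; }\n</style>\n' +
  '<p>Bye</p>';

const cleaned = stripScriptsAndStyles(sample);
console.log(JSON.stringify(cleaned));
```

Only the tags and their contents are removed; the newlines that surrounded them stay in the string, so a follow-up whitespace cleanup may be needed depending on the use case.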
Upvotes: 7