Reputation: 1894
In the example below, the regular expression must match everything between the script tags, including the script tags themselves, but it must not select anything else.
<unmodified html content> <script> * </script> <more unmodified html>
The closest I've gotten so far is:
(<script>)[^~]*(</script>)
Test markup:
<p> blah blah blah
</p> <span class="timestamp"><span class="hurrrp" id="faate_dd4dd">Nov 6, 2013</span>
<script>
if (FancyDate) FancyDate.add('derpaderp_1386447', 1385, 'MAIL_FORMAT');
</script>
</span>
<p> blah blah blah
</p> <span class="timestamp"><span class="hurrrp" id="faate_dd4dd">Nov 6, 2013</span>
<script>
if (FancyDate) FancyDate.add('derpaderp_1386447', 1385, 'MAIL_FORMAT');
</script>
</span>
Upvotes: 0
Views: 511
Reputation: 20494
You just have to make the star lazy:
(<script>)[^~]*?(</script>)
I'm sure if you wait long enough somebody will point out, "you don't parse HTML with regular expressions!" But this should be just fine as long as nobody's putting </script> inside your JavaScript.
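To see the difference the lazy quantifier makes, here is a small sketch in plain JavaScript. The sample markup is a trimmed-down stand-in for the test markup in the question (the variable names and the shortened strings are mine, not from the original post).

```javascript
// Two separate script blocks, modeled loosely on the question's test markup.
const html = [
  '<p> blah </p> <span><script>',
  "if (FancyDate) FancyDate.add('a', 1, 'MAIL_FORMAT');",
  '</script></span>',
  '<p> blah </p> <span><script>',
  "if (FancyDate) FancyDate.add('b', 2, 'MAIL_FORMAT');",
  '</script></span>',
].join('\n');

// Greedy: [^~]* runs to the end of the input, then backtracks to the LAST </script>.
const greedy = /(<script>)[^~]*(<\/script>)/g;
// Lazy: [^~]*? stops at the FIRST </script> after each <script>.
const lazy = /(<script>)[^~]*?(<\/script>)/g;

const greedyMatches = html.match(greedy);
const lazyMatches = html.match(lazy);

console.log(greedyMatches.length); // 1 (both blocks and everything between them)
console.log(lazyMatches.length);   // 2 (one match per script block)
```

The greedy version swallows the markup between the two script blocks, which is exactly the "selects something else" problem from the question.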
Also, I don't quite understand the point of the [^~], but maybe there is a reason I'm not aware of?
If there isn't a reason, you can use this, which will work in case somebody sneaks in a tilde:
(<script>)[\s\S]*?(</script>)
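A quick way to check the tilde case, again as a rough sketch in plain JavaScript (the sample string is made up for illustration):

```javascript
// A script body that happens to contain tildes (bitwise NOT used twice).
const withTilde = '<div><script>var x = ~~3.7;</script></div>';

// [^~] refuses to cross a tilde, so this pattern cannot reach </script>.
const negatedTilde = /(<script>)[^~]*?(<\/script>)/;
// [\s\S] matches any character at all, newlines and tildes included.
const anyChar = /(<script>)[\s\S]*?(<\/script>)/;

console.log(negatedTilde.test(withTilde)); // false
console.log(anyChar.exec(withTilde)[0]);   // <script>var x = ~~3.7;</script>
```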
If you use XRegExp, you can turn on the (s) dot-all flag and just do this:
(<script>).*?(</script>)
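If pulling in a library isn't an option, newer JavaScript engines (ES2018 and later) have a native s flag that behaves the same way for this purpose. The sketch below uses that native flag rather than XRegExp, and assumes a runtime that supports it:

```javascript
// Script block spanning several lines.
const snippet = '<span><script>\nvar a = 1;\n</script></span>';

// Without the s flag, "." does not match newlines, so the pattern fails here.
const noFlag = /(<script>).*?(<\/script>)/;
// With the native s (dotAll) flag, "." matches newlines too.
const dotAll = /(<script>).*?(<\/script>)/s;

console.log(noFlag.test(snippet));     // false
console.log(dotAll.exec(snippet)[0]);  // the full block, from <script> to </script>
```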
I was thinking about using a negative lookahead, (?!</script>), but then the closing tag wouldn't get captured in the result, so I abandoned that.
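For what it's worth, a lookahead can still be made to work if it only guards each character of the body and the closing tag is then matched explicitly after the loop. This is a different pattern from the ones above, sketched here in plain JavaScript with a made-up sample string:

```javascript
// Each body character is consumed only if </script> does not start at that position;
// the closing tag itself is then matched (and included) after the loop.
const tempered = /<script>(?:(?!<\/script>)[\s\S])*<\/script>/g;

const page = 'a<script>one()</script>b<script>two()</script>c';
const found = page.match(tempered);

console.log(found); // [ '<script>one()</script>', '<script>two()</script>' ]
```

The lazy version is shorter and easier to read, so this is mostly useful as a fallback where lazy quantifiers aren't available.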
Upvotes: 2