Reputation: 111
I need to find the single <script> condition </script> block that contains a given condition.
The example below contains 4 script tags. I need to match the second one, which contains the condition, and discard the others. The block starts with <script>. Before the condition there could be spaces or newlines, then the condition itself, like if (window.location.href == bar) { }, then again optional spaces or newlines, and finally the closing </script>.
<script> <!-- discard --->
other stuff
not to be found
</script>
<script> <!-- MATCH --->
if (window.location.href == bar) {
do something
}
</script>
<script> <!-- discard --->
other stuff
not to be found
</script>
<script> <!-- discard --->
other stuff
not to be found
</script>
Thanks in advance
Upvotes: 1
Views: 49
Reputation: 15141
Here you should use DOMDocument instead of a regex to match the tags and check their content.
<?php
ini_set('display_errors', 1);
$object= new DOMDocument();
$object->loadHTML('<html><body><script> <!-- discard --->
other stuff
not to be found
</script>
<script> <!-- MATCH --->
if (window.location.href == bar) {
do something
}
</script>
<script> <!-- discard --->
other stuff
not to be found
</script>
<script> <!-- discard --->
other stuff
not to be found
</script></body></html>');
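// Collect the non-matching <script> elements first; removing nodes while
// iterating the live DOMNodeList would skip elements.
$tagsToRemove = array();
foreach ($object->getElementsByTagName("script") as $element)
{
    if ($element instanceof DOMElement)
    {
        // Keep only scripts whose content contains an "if (" condition.
        if (!preg_match("/if\s*\(/i", $element->nodeValue))
        {
            $tagsToRemove[] = $element;
        }
    }
}
// Remove every script that did not match.
foreach ($tagsToRemove as $element)
{
    $element->parentNode->removeChild($element);
}
// Print the document, which now holds only the matching <script>.
echo $object->saveHTML();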
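If you would rather get the matching block back than strip the others from the document, the same idea can be sketched as below. This is only an illustration under a few assumptions: findScriptsWithCondition is a hypothetical helper name, the pattern assumes the condition is written as if (window.location.href == bar) with optional whitespace between tokens, and the sample input leaves out the discard/MATCH annotation comments so that each script body really starts with its own content.

```php
<?php
// Hypothetical helper (not part of DOMDocument): returns the trimmed body of
// every <script> whose content starts with the condition, ignoring leading
// spaces and newlines.
function findScriptsWithCondition(string $html): array
{
    $doc = new DOMDocument();
    // Suppress libxml notices about imperfect markup.
    $doc->loadHTML($html, LIBXML_NOERROR | LIBXML_NOWARNING);

    // Assumed shape of the condition; adjust the pattern to your real one.
    $pattern = '/^\s*if\s*\(\s*window\.location\.href\s*==\s*bar\s*\)/';

    $found = [];
    foreach ($doc->getElementsByTagName('script') as $script) {
        if (preg_match($pattern, $script->textContent)) {
            $found[] = trim($script->textContent);
        }
    }
    return $found;
}

$html = <<<'HTML'
<script>
other stuff
</script>
<script>
if (window.location.href == bar) {
    doSomething();
}
</script>
<script>
not to be found
</script>
HTML;

// Should list only the second block from the sample above.
print_r(findScriptsWithCondition($html));
```

Anchoring the pattern at the start of the script body is what distinguishes this from the looser /if\s*\(/ check: a script that merely contains an if somewhere in the middle is not returned.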
Upvotes: 1