Reputation: 722
I'm using PHP to get all the "script" tags from web pages, and then appending text after each </script> that is not always valid HTML. Because it's not always valid markup, I can't just use appendChild/replaceChild to add that information, unless I'm misunderstanding how replaceChild works.
Anyway, when I do
$script_tags = $doc->getElementsByTagName('script');
$l = $script_tags->length;
for ($i = $l - 1; $i > -1; $i--) {
    $script_tags_string = $doc->saveXML($script_tags->item($i));
}
This puts "<![CDATA[" and "]]>" around the contents of the script tag. How can I disable this? Please don't tell me to just delete it afterwards; that's what I'm going to do if I can't find a solution for this.
Upvotes: 3
Views: 1869
Reputation: 275
One way I've found to fix this:
Before echoing the document, loop over all script tags and use str_replace to swap "<" and ">" for placeholder strings, making sure those placeholders appear only inside script tags. Then store the result of saveXML() in a variable, and finally use str_replace again to turn the placeholders back into "<" and ">".
Here is the code:
<?php
//First loop
foreach($dom->getElementsByTagName('script') as $script){
$script->nodeValue = str_replace("<", "ESCAPE_CHAR_LT", $script->nodeValue);
$script->nodeValue = str_replace(">", "ESCAPE_CHAR_GT", $script->nodeValue);
}
//Obtaining XHTML
$output = $dom->saveXML();
//Second replace
$output = str_replace("ESCAPE_CHAR_LT", "<", $output);
$output = str_replace("ESCAPE_CHAR_GT", ">", $output);
//Print document
echo $output;
?>
As you can see, now you are free to use "<" ">" in your scripts.
Hope this helps someone.
Upvotes: 0
Reputation: 43253
I have a suspicion that the CDATA is inserted because it would otherwise be invalid XML.
Have you tried using saveHTML instead of saveXML?
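A minimal sketch of what that swap could look like, assuming the document was loaded with loadHTML() and that the PHP version is 5.3.6 or later (when saveHTML() started accepting a node argument). The $html sample string is hypothetical and only there to make the snippet self-contained; the loop itself mirrors the one in the question.

```php
<?php
// Hypothetical sample input; any HTML containing a <script> element will do.
$html = '<html><head><script>if (a < b) { run(); }</script></head><body></body></html>';

$doc = new DOMDocument();
// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$script_tags = $doc->getElementsByTagName('script');
for ($i = $script_tags->length - 1; $i > -1; $i--) {
    // Serialize the node with the HTML serializer rather than the XML one.
    $script_tags_string = $doc->saveHTML($script_tags->item($i));
    echo $script_tags_string, "\n";
}
```

Since the HTML serializer treats script contents as raw text, this should avoid the CDATA wrapper without any post-processing, though it is worth checking against your own pages.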
Upvotes: 3