Reputation: 3083
I want to extract the HTML contained between these two comments into a string:
<!--content-start-->
desired html
<!--content-end-->
So I use preg_match, right?
preg_match("/<!--content-start-->(.*)<!--content-end-->/i", $rss, $content);
But it won't work. Maybe there is a problem with the regex?
Thank you.
Upvotes: 0
Views: 538
Reputation: 42690
Something like this should work. The XPath query looks for a comment whose text is exactly "content-start" and returns the sibling nodes that follow it. We then loop through those siblings, appending each one to the result, until we reach the closing "content-end" comment.
$html = <<< HTML
<!--content-start-->
<p>Here is my <i>desired html</i></p>
<!-- a comment -->
<div class="foo">Here is more</div>
<!--content-end-->
<p>Not returning this</p>
HTML;
$return = "";
$dom = new DOMDocument;
$dom->loadHTML($html, LIBXML_HTML_NODEFDTD | LIBXML_HTML_NOIMPLIED);
$xpath = new DOMXPath($dom);
$siblings = $xpath->query("//comment()[.='content-start']/following-sibling::node()");
foreach ($siblings as $node) {
    // Stop as soon as we reach the closing marker comment.
    if ($node instanceof DOMComment && $node->textContent === "content-end") {
        break;
    }
    $return .= $dom->saveHTML($node) . "\n";
}
echo $return;
Output:
<p>Here is my <i>desired html</i></p>
<!-- a comment -->
<div class="foo">Here is more</div>
Upvotes: 1
Reputation: 620
Perhaps the s modifier will help: without it, the dot in your pattern does not match the newlines between the two comments. Check the documentation:
s (PCRE_DOTALL)
If this modifier is set, a dot metacharacter in the pattern matches all characters, including newlines. Without it, newlines are excluded. This modifier is equivalent to Perl's /s modifier. A negative class such as [^a] always matches a newline character, independent of the setting of this modifier.
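As a rough sketch of how that applies to the pattern from the question: the sample $rss value below is made up for illustration, and the lazy .*? is my own addition so the match stops at the first closing comment rather than the last one.

```php
<?php
// Hypothetical sample input standing in for the asker's $rss string.
$rss = "<p>before</p>\n<!--content-start-->\n<p>desired html</p>\n<!--content-end-->\n<p>after</p>";

// "s" (PCRE_DOTALL) lets "." match newlines; "i" is kept from the question.
// "(.*?)" is lazy, so it ends at the first closing comment it finds.
if (preg_match('/<!--content-start-->(.*?)<!--content-end-->/is', $rss, $matches)) {
    // $matches[1] holds the captured group; trim the surrounding newlines.
    $content = trim($matches[1]);
} else {
    $content = null;
}

echo $content, "\n";
```

If the markers can appear more than once in the input, preg_match_all with the same pattern should collect every block, though I have not tested that against your feed.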
Upvotes: 1