Reputation: 2097
I am trying to get the data between these two divs:
--<div id="p_tab4" class="p_desc" style="display: block;">
--<div id="p_top_cats" class="p_top_cats">
I am using the below regex, but it's not getting me anything:
/<div id=\"p_tab4\" class=\"p_desc\" style=\"display: block;\">(.*?)<div id=\"p_top_cats\" class=\"p_top_cats\">/
How can I correct this regex?
Upvotes: 0
Views: 605
Reputation: 173662
Assuming your HTML is somewhat well formed, like below:
<div id="p_tab4" class="p_desc" style="display: block;">...</div>
some stuff in between
<div id="p_top_cats" class="p_top_cats">
</div>
You can make use of DOMDocument and XPath:
$html = <<<'EOS'
<div id="p_tab4" class="p_desc" style="display: block;">...</div>
some stuff in between
<div id="p_top_cats" class="p_top_cats">
</div>
EOS;
$doc = new DOMDocument;
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);
$query = '//node()[preceding-sibling::div[@id="p_tab4"] and following-sibling::div[@id="p_top_cats"]]';
foreach ($xpath->query($query) as $node) {
    echo $node->textContent, PHP_EOL;
}
Upvotes: 2
Reputation: 174874
There is nothing wrong with the regex itself, but you need to turn on DOTALL mode (the s modifier) so that the dot in your regex also matches newline characters (line breaks).
~<div id=\"p_tab4\" class=\"p_desc\" style=\"display: block;\">(.*?)<div id=\"p_top_cats\" class=\"p_top_cats\">~s
Code:
$re = '~<div id=\"p_tab4\" class=\"p_desc\" style=\"display: block;\">(.*?)<div id=\"p_top_cats\" class=\"p_top_cats\">~s';
$str = "--<div id=\"p_tab4\" class=\"p_desc\" style=\"display: block;\">\n--<div id=\"p_top_cats\" class=\"p_top_cats\">";
// Only read the capture group if the pattern actually matched
if (preg_match($re, $str, $matches)) {
    echo $matches[1];
}
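To see the effect of the s modifier in isolation, here is a minimal sketch that runs the same pattern with and without it. The sample text between the tags and the variable names ($open, $close, $body) are made up for illustration, and preg_quote() is used here only to avoid escaping the tags by hand.

```php
<?php
// Hypothetical input: the two opening tags from the question,
// separated by text that spans line breaks.
$open  = '<div id="p_tab4" class="p_desc" style="display: block;">';
$close = '<div id="p_top_cats" class="p_top_cats">';
$str   = $open . "\nsome text\n" . $close;

// preg_quote() escapes regex metacharacters in the literal tags.
$body = preg_quote($open, '~') . '(.*?)' . preg_quote($close, '~');

// Without the s modifier, "." does not match "\n", so nothing is found.
$withoutS = preg_match('~' . $body . '~', $str, $m1);

// With the s modifier, "." also matches "\n" and the capture succeeds.
$withS = preg_match('~' . $body . '~s', $str, $m2);

var_dump($withoutS); // int(0)
var_dump($withS);    // int(1)
var_dump($m2[1]);    // the text between the tags, line breaks included
```

If the text between the tags never contains a line break, both variants behave the same, which is why the original pattern can appear to work on single-line test input.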
Upvotes: 1