Reputation: 453
I'm unable to figure out how to get text between HTML tags. In my scenario the required text is not wrapped in any tag except the paragraph tag <p>.
<div class="entry clearfix">
<p>111</p>
<p><img class="alignnone size-medium wp-image-38376" src="1.jpg" alt="Talvar" /></p>
<p><strong>111: </strong>111<br/>
<strong>111:</strong> 111<br/>
<strong>111:</strong> 111 111<br/>
<strong>111: </strong>111<br/>
<strong>111: </strong>1111
</p>
<p><strong>111</strong></p>
<p>
<strong>01 –</strong> data1 <strong><a href="#">Download</a><br/>
</strong><em>222</em><br/>
<strong>02 –</strong> data2 <strong><a href="#">Download</a><br/>
</strong><em>222</em><br/>
<strong>03 –</strong> data3 <strong><a href="#">Download</a><br/>
</strong><em>222</em><br/>
<strong>04 –</strong> data4 <strong><a href="#">Download</a><br/>
</strong><em>222</em>
</p>
<p><strong>222</strong></p>
<p><strong><a href="" target="_blank">3333</a></strong></p>
<p><strong>eb</strong></p></div>
I need data1, data2, data3 and data4. To get them I select the fifth <p>, which is index 4 in the array.
foreach($html->find('div[class="entry"]') as $row){
$a = $row->find('p',4);
echo $dt = $a->find('text',1)->plaintext; // returns me only data1
}
data1, data2, data3 and data4 are not inside any tag except <p>.
If I extract them with strip_tags(), it returns all the text, including 111, Download, 222, etc. Please advise how I can get just the data series.
Upvotes: 4
Views: 450
Reputation: 11318
Not sure about more elegant ways, but this should work too:
foreach($html->find('div[class="entry"]') as $row){
    $a = $row->find('p',4);
    $text = $a->outertext; // work on a string copy of the paragraph's HTML
    // remove every strong and em element, including its contents
    foreach(array_merge($a->find('strong'), $a->find('em')) as $tag) {
        $text = str_replace($tag->outertext, '', $text);
    }
    echo strip_tags($text,'<br>'); // if you want to keep br tags
}
So the idea is: remove the strong and em tags (and the text inside them, including links) from the targeted p with str_replace, and keep the rest.
If your HTML structure is like the one you've posted, it should work.
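A possible alternative, in case string replacement feels fragile: the same "keep only the loose text inside p" idea can be sketched with PHP's built-in DOMDocument and DOMXPath instead of Simple HTML DOM. This swaps in a different parser, so treat it as a rough sketch rather than a drop-in replacement. The $html sample below is a hypothetical, trimmed-down copy of the paragraph from the question, and $items is just an illustrative variable name.

```php
<?php
// Hypothetical, trimmed-down copy of the target paragraph from the question.
$html = <<<HTML
<div class="entry clearfix">
<p>
<strong>01 –</strong> data1 <strong><a href="#">Download</a><br/>
</strong><em>222</em><br/>
<strong>02 –</strong> data2 <strong><a href="#">Download</a><br/>
</strong><em>222</em>
</p>
</div>
HTML;

$doc = new DOMDocument();
libxml_use_internal_errors(true); // keep parser warnings out of the output
// The XML declaration is a common trick to make libxml read the input as UTF-8.
$doc->loadHTML('<?xml encoding="UTF-8">' . $html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// text() matches only text nodes that are direct children of the <p>,
// so anything inside <strong>, <em> or <a> is skipped.
$nodes = $xpath->query(
    '//div[contains(concat(" ", normalize-space(@class), " "), " entry ")]/p/text()'
);

$items = [];
foreach ($nodes as $node) {
    $value = trim($node->nodeValue);
    if ($value !== '') { // drop whitespace-only nodes between tags
        $items[] = $value;
    }
}

echo implode(', ', $items), PHP_EOL; // data1, data2
```

With the full markup from the question you would presumably need to target the fifth paragraph explicitly, for example by putting (…/p)[5]/text() at the end of the query; note that XPath positions start at 1, not 0.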
Upvotes: 1