Reputation: 627
I am new to using Simple HTML DOM with PHP and I am struggling to extract multiple HTML tags from one class. I have multiple blocks of HTML like this in a single page:
<div class="file-right">
<a href="/secrets-of-the-millionaire-mind-tomocubcom-e17682584.html" class="ai-similar" data-id="17682584" data-loc="3">
<h2><b>Secrets</b> of the <b>Millionaire</b> <b>Mind</b> - TOMOCUB.COM</h2>
</a>
<span class="fi-pagecount">223 Pages</span>
<span class="fi-year">2005</span>
<span class="fi-size hidemobile">1015 KB</span>
</div>
2 - <b>Secrets</b> of the <b>Millionaire</b> <b>Mind</b> and your achievement of <b>success</b>. As you’ve probably fo ...
</div>
and from each block of this HTML I want to extract
I have been doing this in PHP but keep getting errors. This is the code I have so far:
$html = @str_get_html($response);
$allblocks=$html->find('div.file-right'); //this selects all file-right blocks
if(isset($allblocks)){
foreach($allblocks as $singleblock){
echo $singleblock->plaintext; // but i get an error here PHP Notice: Array to string conversion
}
}
Can anyone help me, please?
Upvotes: 2
Views: 928
Reputation: 57121
You need to take the HTML apart layer by layer. You started by finding the <div> tag. From that you can find the <a> tag within this <div> and then get its href attribute (using ->href). This code assumes that there is only one <a> tag per block, so rather than a foreach it just uses the first one (using [0]).
The <span> tags follow a similar process, but as there are repeated elements, this time it uses a foreach. This code outputs the class attribute and the contents of each span.
$html = str_get_html($response);
$allblocks = $html->find('div.file-right'); // this selects all file-right blocks
if ( count($allblocks) > 0 ) {
    foreach ( $allblocks as $block ) {
        // find() returns an array, so take the first (and assumed only) <a>
        $anchor = $block->find("a");
        echo "href=".$anchor[0]->href.PHP_EOL;
        echo "text=".$anchor[0]->plaintext.PHP_EOL;
        // there are several <span> tags per block, so loop over them
        $spans = $block->find("span");
        foreach ( $spans as $span ) {
            echo "span=".$span->class."=".$span->plaintext.PHP_EOL;
        }
    }
}
Note that your original code used isset($allblocks), but as the line before sets its value, the variable will always be set, even if find() didn't match anything. In this code I use count() to check whether anything was returned by the previous call to find().
With your sample HTML, wrapped only in a minimal page, the output is...
href=/secrets-of-the-millionaire-mind-tomocubcom-e17682584.html
text= Secrets of the Millionaire Mind - TOMOCUB.COM
span=fi-pagecount=223 Pages
span=fi-year=2005
span=fi-size hidemobile=1015 KB
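To illustrate the earlier note about isset() versus count(), here is a minimal sketch that does not need the library at all. It assumes that find() gives back an empty array when nothing matches, and uses a hand-written empty array ($allblocks below) as a hypothetical stand-in for that result.

```php
<?php
// Hypothetical stand-in for $html->find('div.file-right') when nothing
// matches; the assumption here is that the result is an empty array.
$allblocks = [];

// isset() only checks that the variable exists and is not null,
// so an empty array still counts as "set".
var_dump(isset($allblocks));      // bool(true)

// count() reports how many elements there are, which is what the
// loop actually depends on.
var_dump(count($allblocks) > 0);  // bool(false)
```

So a check based on isset() would always pass after the assignment, while the count() check only passes when there is at least one block to process.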
Upvotes: 1