Reputation: 137
I'm using Simple HTML Dom to parse text between HTML tags.Everything went well until i faced this challenge.I can easily parse text within div tags but how can I parse text between two div tags.
This is the HTML to be parsed:
<div class="album"><b>Album1</b> (1997)</div>
<a href="song11.html" target="_blank">song11</a><br />
<a href="song12.html" target="_blank">song12</a><br />
<div class="album"><b>Album2</b> (1998)</div>
<a href="song21.html" target="_blank">song21</a><br />
<a href="song22.html" target="_blank">song22</a><br />
<div class="album"><b>Album3</b> (1999)</div>
<a href="song31.html" target="_blank">song31</a><br />
<a href="song32.html" target="_blank">song32</a><br />
I want the first album title (Album1), its year (1997) and both the song links with their titles in a single array. Then the second album in a second array and the third album in a third array.
Upvotes: 2
Views: 2143
Reputation: 54984
Don't think of it as text between two div nodes, think of it as iterating the div nodes and including some of the a nodes that follow them:
$html =<<<EOF
<div class="album"><b>Album1</b> (1997)</div>
<a href="song11.html" target="_blank">song11</a><br />
<a href="song12.html" target="_blank">song12</a><br />
<div class="album"><b>Album2</b> (1998)</div>
<a href="song21.html" target="_blank">song21</a><br />
<a href="song22.html" target="_blank">song22</a><br />
<div class="album"><b>Album3</b> (1999)</div>
<a href="song31.html" target="_blank">song31</a><br />
<a href="song32.html" target="_blank">song32</a><br />
EOF;
require('simple_html_dom.php');
$doc = str_get_html($html);
$albums = array();
foreach($doc->find('div.album') as $div){
$album = array();
$album['title'] = $div->find('b', 0)->innertext;
$album['song1'] = $div->nextSibling()->innertext;
$albums[] = $album;
}
var_dump($albums);
Upvotes: 2