Reputation: 159
I want to find the p elements with class ft00 that sit between "Work Experience" and "EDUCATION AND TRAINING", and extract the text of those that contain dates, from the following HTML:
<p class = "ft00">Introduction</p>
<p class = "ft00">John Smith</p>
<p class = "ft02">Email:</p>
<p class = "ft00">[email protected]</p>
<p class = "ft00">Work Experience</p>
<p class = "ft00">27 July 2017</p>
<p class = "ft02">ABC Company</p>
<p class = "ft00">19 May 2018</p>
<p class ="ft02">XYZ Company</p>
<p class = "ft00">EDUCATION AND TRAINING</p>
So far I can extract all the data between "Work Experience" and "EDUCATION AND TRAINING" (both ft00 and ft02), and that part works properly. The code is given below:
$fexp = $html->find('p[plaintext^=Work Experience]');
$items = array();
foreach ($fexp as $keye) {
    while ( $keye->nextSibling() ) {
        if ( $keye->nextSibling() == TRUE ) {
            $keye = $keye->nextSibling();
            $varce = $keye->plaintext;
        }
        if ( trim($varce) == "EDUCATION AND TRAINING" ){
            break;
        }
        //$test[] = $collection;
        $items[] = $varce;
        // echo $varce;
    }
}
var_dump($items);
I am close, but I can't work out how to keep only the ft00 (date) entries. Any help would be appreciated, thanks :-)
Upvotes: 1
Views: 1782
Reputation: 159
Here is the working code:
$test = array();
$matching = false;
$collection = $html->find('p.ft00');
foreach ($collection as $tkey) {
    $text = trim($tkey->plaintext);
    // Compare case-insensitively: the HTML has "Work Experience", not "WORK EXPERIENCE".
    if (strcasecmp($text, "Work Experience") === 0 || $matching) {
        $test[] = $text;
        $matching = true;
    }
    if ($text == "EDUCATION AND TRAINING") {
        break;
    }
}
print_r($test);
Output:
Array
(
[0] => Work Experience
[1] => 27 July 2017
[2] => 19 May 2018
[3] => EDUCATION AND TRAINING
)
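If the two headings themselves should not end up in the result, the same flag-based loop can skip them. Below is a minimal sketch of that idea, assuming the plaintext of the p.ft00 elements has already been collected into a plain array in document order. The names sliceBetween and $ft00 are hypothetical and are not part of Simple HTML DOM.

```php
<?php
/**
 * Return the entries strictly between two marker strings.
 * Hypothetical helper for illustration; markers are compared case-insensitively.
 */
function sliceBetween(array $texts, string $start, string $end): array
{
    $result = [];
    $matching = false;
    foreach ($texts as $text) {
        $text = trim($text);
        if (!$matching) {
            // Keep skipping until the start marker is seen; the marker itself is not stored.
            $matching = strcasecmp($text, $start) === 0;
            continue;
        }
        if (strcasecmp($text, $end) === 0) {
            break; // stop before storing the end marker
        }
        $result[] = $text;
    }
    return $result;
}

// Plaintext of the p.ft00 elements from the question, in document order.
$ft00 = [
    'Introduction',
    'John Smith',
    '[email protected]',
    'Work Experience',
    '27 July 2017',
    '19 May 2018',
    'EDUCATION AND TRAINING',
];

// Prints only the two date entries.
print_r(sliceBetween($ft00, 'Work Experience', 'EDUCATION AND TRAINING'));
```

With Simple HTML DOM, the input array could be built by pushing each $tkey->plaintext from the find('p.ft00') loop above before calling the helper.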
Upvotes: 1
Reputation: 46602
With DOMDocument and DOMXPath you could do it as follows. I've never used Simple HTML DOM Parser, but I'm presuming it supports XPath.
<?php
$dom = new DOMDocument();
$dom->loadHtml('
<p class = "ft00">Introduction</p>
<p class = "ft00">John Smith</p>
<p class = "ft02">Email:</p>
<p class = "ft00">[email protected]</p>
<p class = "ft00">Work Experience</p>
<p class = "ft00">27 July 2017</p>
<p class = "ft02">ABC Company</p>
<p class = "ft00">19 May 2018</p>
<p class ="ft02">XYZ Company</p>
<p class = "ft00">EDUCATION AND TRAINING</p>
', LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$xpath = new DOMXPath($dom);
$result = [];
$matching = false;
foreach ($xpath->query("//p[contains(@class, 'ft00') or contains(@class, 'ft02')]/text()") as $p) {
    if ($p->nodeValue === 'Work Experience' || $matching) {
        $result[] = $p->nodeValue;
        $matching = true;
    }
    if ($p->nodeValue === 'EDUCATION AND TRAINING') {
        break;
    }
}
print_r($result);
Result:
Array
(
[0] => Work Experience
[1] => 27 July 2017
[2] => ABC Company
[3] => 19 May 2018
[4] => XYZ Company
[5] => EDUCATION AND TRAINING
)
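The result above still mixes dates, company names and the two headings, while the question only asks for the dates. One way to narrow it down is to keep just the entries that parse as a date. This is a rough sketch, assuming every date is written like "27 July 2017" (day, full month name, four-digit year). The helper name keepDates and the format string are assumptions for illustration, not something taken from the question.

```php
<?php
/**
 * Keep only the strings that parse with the given date format.
 * Hypothetical helper; the default format 'j F Y' matches e.g. "27 July 2017".
 */
function keepDates(array $texts, string $format = '!j F Y'): array
{
    $isDate = static function (string $text) use ($format): bool {
        // createFromFormat() returns false when the text does not fit the format.
        return DateTime::createFromFormat($format, trim($text)) !== false;
    };

    // array_values() reindexes the filtered array from 0.
    return array_values(array_filter($texts, $isDate));
}

// The entries collected by the loop above.
$result = [
    'Work Experience',
    '27 July 2017',
    'ABC Company',
    '19 May 2018',
    'XYZ Company',
    'EDUCATION AND TRAINING',
];

// Prints only the two date entries.
print_r(keepDates($result));
```

If the documents use other date layouts, the format argument would need to change accordingly, or several formats could be tried in turn.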
Upvotes: 3