Liviu ZeJah

Reputation: 113

xPath data explode based on new line

In my current project, I'm parsing this HTML with XPath:

<div class="prod_view_info">
    <p>notimportant</p><p>notimportant</p>
    <p>
        <b>Putere (Consum/ora): </b> 3W<br> 
        <b>Fasung: </b> MR16<br> 
        <b>Tensiune de alimentare: </b> 12V<br> 
        <b>Flux luminos: </b> 270 - 300 lm<br> 
        <b>Flux luminos per Watt: </b> 90-100 lm/W<br> 
        <b>Putere LED: </b> 3* 1W<br> 
        <b>Frecventa de lucru: </b> DC<br> 
        <b>Culoarea luminii: </b>alba calda<br> 
        <b>Temperatura: </b> 30 / 50°C<br> 
        <b>Material: </b> Aluminiu<br> 
        <b>Grad de protectie: </b> IP21<br> 
        <b>Durata de viata: </b> &gt; 50000 ore<br> 
        <b>Garantie: </b>2 ani<br>
        <b>Certificate: </b> 
    </p>
</div>

using the following PHP code:

foreach( $xpathprd->query('//div[@class="prod_view_info"]/p[3]/node()') as $techdata ) {      

  $techdatap = $techdata->nodeValue; 
  $techdatapChunks = explode(":", trim($techdatap));
  $producttechdata[] = $techdatapChunks;
}

$techdata_json = json_encode($producttechdata);

echo "<hr>$techdata_json";

The goal is to get the key/value pair from each line and serialize it as JSON. For example:

<b>Putere (Consum/ora): </b> 3W<br>

should be :

["Putere (Consum/ora)","3W"]

But XPath strips out the tags, and somewhere I'm missing a space, so the explode result is all messed up.

How do I make this work?

Upvotes: 1

Views: 338

Answers (1)

Kevin

Reputation: 41873

You could use a combination of ->childNodes and ->nextSibling in this case, which makes each key/value pair easier to detect. Example:

$dom = new DOMDocument();
$dom->loadHTML($html_string);
$xpathprd = new DOMXPath($dom);

$data = array();
$elements = $xpathprd->query('//div[@class="prod_view_info"]/p[3]')->item(0);
foreach($elements->childNodes as $child) {
    if(isset($child->tagName) && $child->tagName == 'b') { // check whether it's a `<b>` tag
        $key = trim($child->nodeValue);
        // the next sibling is the DOMText node holding the value; fall back to '' if there is none
        $value = $child->nextSibling ? trim($child->nextSibling->nodeValue) : '';
        $data[] = array($key, $value); // append the [key, value] pair
    }
}

echo '<pre>';
$data = json_encode($data);
print_r($data);

Should yield:

[["Putere (Consum\/ora):","3W"],["Fasung:","MR16"],["Tensiune de alimentare:","12V"],["Flux luminos:","270 - 300 lm"],["Flux luminos per Watt:","90-100 lm\/W"],["Putere LED:","3* 1W"],["Frecventa de lucru:","DC"],["Culoarea luminii:","alba calda"],["Temperatura:","30 \/ 50\u00c2\u00b0C"],["Material:","Aluminiu"],["Grad de protectie:","IP21"],["Durata de viata:","> 50000 ore"],["Garantie:","2 ani"],["Certificate:",""]]
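Note that the keys above keep their trailing colon, while the question asks for ["Putere (Consum/ora)","3W"], and that the degree sign comes out as \u00c2\u00b0, which suggests the UTF-8 input was read as ISO-8859-1. A minimal sketch of a variant that addresses both might look like the following. The function name extractTechData is hypothetical (it is not from the question), and the sketch assumes the HTML string is UTF-8 and that the data always sits in the third <p> of the div:

```php
<?php
// Hypothetical helper (name not from the question): returns a list of [key, value] pairs.
function extractTechData(string $html): array
{
    $dom = new DOMDocument();
    // Assumption: $html is UTF-8. Without a charset hint, loadHTML() may fall
    // back to ISO-8859-1, so a meta tag declaring the encoding is prepended.
    $meta = '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">';
    libxml_use_internal_errors(true); // keep parser warnings out of the output
    $dom->loadHTML($meta . $html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    $p = $xpath->query('//div[@class="prod_view_info"]/p[3]')->item(0);
    if ($p === null) {
        return array(); // no matching paragraph
    }

    $data = array();
    foreach ($p->childNodes as $child) {
        if ($child->nodeName !== 'b') {
            continue; // skip text nodes and <br> elements
        }
        // Trim whitespace first, then drop the trailing colon from the label.
        $key = rtrim(trim($child->nodeValue), ':');
        // Only use the next sibling if it really is a text node.
        $next = $child->nextSibling;
        $value = ($next instanceof DOMText) ? trim($next->nodeValue) : '';
        $data[] = array($key, $value);
    }
    return $data;
}

// Shortened sample input with the same structure as the question's HTML.
$html = '<div class="prod_view_info"><p>x</p><p>y</p><p>'
      . '<b>Putere (Consum/ora): </b> 3W<br>'
      . '<b>Fasung: </b> MR16<br>'
      . '<b>Certificate: </b></p></div>';

echo json_encode(extractTechData($html), JSON_UNESCAPED_SLASHES | JSON_UNESCAPED_UNICODE);
```

For the shortened sample this should print [["Putere (Consum/ora)","3W"],["Fasung","MR16"],["Certificate",""]]. The JSON_UNESCAPED_SLASHES and JSON_UNESCAPED_UNICODE flags are optional; they only keep "/" and non-ASCII characters readable in the encoded string.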

Upvotes: 1
