Julien Rns Neo

Reputation: 59

Blank spaces with XPath (PHP)

I'm trying to code a "robot" that crawls a forum to gather statistics.

Here is my code: https://pastebin.com/6zAaQ0fF

<?php

$ch = curl_init();
$timeout = 0; // set to zero for no timeout
curl_setopt ($ch, CURLOPT_URL, 'http://m.jeuxvideo.com/forums/42-51-61922886-1-0-1-0-once-upon-time-in-hollywood.htm');
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$file_contents = curl_exec($ch);
curl_close($ch);


$dom = new DOMDocument;
libxml_use_internal_errors(true);
$dom->loadHTML($file_contents);

$xpath = new DOMXPath($dom);
$posts = $xpath->query("//div[@class='who-post']");
$dates = $xpath->query("//div[@class='date-post']");
$contenus = $xpath->query("//div[@class='contenu']");




foreach ($posts as $post) {
    $nodes = $post->childNodes;

    foreach ($nodes as $node) {
        $value = trim($node->nodeValue);
        echo $node->nodeValue;
        $tab[] = $node->nodeValue;
    }
}


foreach ($dates as $date) {

    $nodes = $date->childNodes;
    foreach ($nodes as $node) {
       echo trim($node->nodeValue);
    }

}

?>
<pre>
<?php 
print_r($tab);
?>
</pre>

I don't understand why I get blank entries in my array, while the output looks correct when I use echo...

Thank you for your help!

Upvotes: 0

Views: 159

Answers (1)

Syscall

Reputation: 19780

You could select the <a> element inside each post's div directly:

$posts = $xpath->query("//div[@class='who-post']/a");

Also, in the first loop you compute the trimmed value but never use it:

$value = trim($node->nodeValue);
$tab[] = $node->nodeValue;

Change to:

$value = trim($node->nodeValue);
$tab[] = $value;

Output:

Array
(
    [0] => Thewiitcheur
    [1] => Thewiitcheur
    [2] => Shaq24
    [3] => Downy-down
    [4] => LosyCITY
    [5] => DanaAndrews
    [6] => Racouske
    [7] => Gnagngan
    [8] => harvey-specter
    [9] => frivyhotasmr
    [10] => Jowst
    [11] => Thewiitcheur
    [12] => ChibreCarnivore
    [13] => pseudobanni5678
    [14] => Chimpanzee
    [15] => EncoreBan25
    [16] => spagetthivolant
    [17] => Chimpanzee
    [18] => JeromeGerber
    [19] => chopsueys
)
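
To see where the blank entries likely come from, here is a minimal, self-contained sketch. The inline HTML is a hypothetical stand-in for the forum page (the real markup may differ), and `$raw` is an illustrative name that is not in the original code. Iterating over `childNodes` of the div returns the whitespace-only text nodes around the `<a>` as well as the `<a>` itself, and those whitespace nodes end up in the array. A browser collapses that whitespace when it is echoed, but `print_r` inside `<pre>` shows it.

```php
<?php
// Hypothetical markup standing in for the forum page; the real structure may differ.
$html = <<<HTML
<div class="who-post">
    <a href="#">Thewiitcheur</a>
</div>
<div class="who-post">
    <a href="#">Shaq24</a>
</div>
HTML;

$dom = new DOMDocument;
libxml_use_internal_errors(true); // silence parser warnings on imperfect HTML
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Original approach: every child node of the div is stored untrimmed,
// which typically includes the whitespace-only text nodes around the <a>.
$raw = [];
foreach ($xpath->query("//div[@class='who-post']") as $post) {
    foreach ($post->childNodes as $node) {
        $raw[] = $node->nodeValue;
    }
}

// Fixed approach: select the <a> elements directly and store the trimmed text.
$tab = [];
foreach ($xpath->query("//div[@class='who-post']/a") as $link) {
    $tab[] = trim($link->nodeValue);
}

print_r($tab);
```

With this sample input, `$tab` should hold only the two user names, while `$raw` also carries the surrounding whitespace entries.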

Upvotes: 1
