Russel Daniel

Reputation: 21

Array Numbering Issue

Why can this code fetch data from the first page below and insert it into numbered arrays, but fail to do the same for the second page?

http://nimishprabhu.com

https://www.fiverr.com/search/gigs?utf8=%E2%9C%93&source=guest-homepage&locale=en&search_in=everywhere&query=php

For the second page, the output shows arrays numbered like the following, which is not correct:

Array ( [0] => mailto:[email protected] ) 
Array ( [0] => https://collector.fiverr.com/api/v1/collector/noScript.gif?appId=PXK3bezZfO
        [1] => https://collector.fiverr.com/api/v1/collector/pxPixel.gif?appId=PXK3bezZfO ) 
Array ( [0] => One Small Step )

Code:

<?php

/*
2.
FINDING HTML ELEMENTS BASED ON THEIR TAG NAMES

Suppose you wanted to find each and every image on a webpage or, say, each
and every hyperlink.
We will use the "find" function to extract this information from the
object, using Simple HTML DOM Parser:
*/

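require 'simple_html_dom.php';

$html = file_get_html('https://www.fiverr.com/search/gigs?utf8=%E2%9C%93&source=guest-homepage&locale=en&search_in=everywhere&query=php');

// file_get_html() returns false when the page cannot be fetched or parsed
if (!$html) {
  die('Could not fetch or parse the page.');
}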

//to fetch all hyperlinks from a webpage
$links = array();
foreach($html->find('a') as $a) {
  $links[] = $a->href;
}
print_r($links);
echo "<br />";

//to fetch all images from a webpage
$images = array();
foreach($html->find('img') as $img) {
  $images[] = $img->src;
}
print_r($images);
echo "<br />";

//to find h1 headers from a webpage
$headlines = array();
foreach($html->find('h1') as $header) {
  $headlines[] = $header->plaintext;
}
print_r($headlines);
echo "<br />";

?>

Any suggestions and code samples are welcome for learning purposes. I am a self-taught student.

Upvotes: 0

Views: 70

Answers (1)

Damian Stępień

Reputation: 154

The reason is that the page you are trying to download (fiverr.com) relies on JavaScript to load its content dynamically. PHP cannot handle this: it only sees the HTML that the server sent and cannot parse or run JavaScript, so elements that a browser would add later never show up in the find() results. Since this is for learning purposes, you can simply try a different website.
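To see the effect without touching the network, here is a minimal sketch. It is not the asker's setup: it uses PHP's built-in DOMDocument instead of Simple HTML DOM so that it runs with no extra files, and the markup in $serverHtml is a made-up stand-in for what a JavaScript-driven page might send.

```php
<?php
// Hypothetical markup standing in for a JavaScript-rendered page:
// one static link, plus a script that would add another link in a browser.
$serverHtml = <<<'HTML'
<!DOCTYPE html>
<html>
<body>
  <a href="mailto:someone@example.com">Contact</a>
  <div id="results"></div>
  <script>
    // A browser would run this and add a link; PHP never executes it.
    var link = document.createElement('a');
    link.href = '/gig/1';
    document.getElementById('results').appendChild(link);
  </script>
</body>
</html>
HTML;

// Parse the raw HTML exactly as it was "sent by the server".
libxml_use_internal_errors(true); // silence warnings about HTML5 markup
$doc = new DOMDocument();
$doc->loadHTML($serverHtml);
libxml_clear_errors();

// Collect every href, the same way the question collects $a->href.
$links = [];
foreach ($doc->getElementsByTagName('a') as $a) {
    $links[] = $a->getAttribute('href');
}

print_r($links); // only the static mailto: link is found
```

The script-generated link is missing because the script is just text to the parser. Each array that print_r() shows is also numbered from 0 on its own, so short arrays like the ones in the question are what you would expect from a page whose static HTML contains very few matching tags.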

However, if you want a working solution, you should look into Selenium. It is a browser automation tool that drives a real browser (optionally in headless mode), so everything a browser normally does, including running JavaScript, happens before you read the page. Through its WebDriver API you can then parse the fully rendered content of websites like fiverr.com.
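A rough sketch of the Selenium route in PHP follows. It assumes things that are not in the question: the php-webdriver/webdriver package installed through Composer, a Selenium server (or chromedriver) already running, and the address in $serverUrl, which is only a placeholder for wherever that server listens. Treat it as a starting point, not a tested scraper.

```php
<?php
// Assumption: installed with `composer require php-webdriver/webdriver`.
require __DIR__ . '/vendor/autoload.php';

use Facebook\WebDriver\Remote\DesiredCapabilities;
use Facebook\WebDriver\Remote\RemoteWebDriver;
use Facebook\WebDriver\WebDriverBy;

// Assumption: address of a running Selenium server; adjust to your setup.
$serverUrl = 'http://localhost:4444';

// Start a real Chrome session controlled through WebDriver.
$driver = RemoteWebDriver::create($serverUrl, DesiredCapabilities::chrome());

try {
    // The browser loads the page and runs its JavaScript.
    $driver->get('https://www.fiverr.com/search/gigs?utf8=%E2%9C%93&source=guest-homepage&locale=en&search_in=everywhere&query=php');

    // Collect hrefs from the rendered DOM, mirroring the loop in the question.
    $links = [];
    foreach ($driver->findElements(WebDriverBy::tagName('a')) as $a) {
        $links[] = $a->getAttribute('href');
    }
    print_r($links);
} finally {
    // Always close the browser session, even if something above fails.
    $driver->quit();
}
```

If the content arrives some time after the initial page load, the list can still come back short; in that case an explicit wait for the elements you need would have to go before the findElements() call.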

Upvotes: 2
