I used the code below to successfully collect data from a specific page:
include 'simplehtmldom/simple_html_dom.php';

$html = file_get_html('http://test.com/file/1209i0329/');

// Find all article blocks
foreach ($html->find('div.Content') as $file) {
    $item['date'] = $file->find('id.article-date', 0)->plaintext;
    $item['location'] = $file->find('id.article-location', 0)->plaintext;
    $item['price'] = $file->find('div.article', 0)->plaintext;
    $files[] = $item;
}

print_r($files);
The code works well for http://test.com/file/1209i0329/, but my goal is to collect data from all pages on this domain whose URLs start with http://test.com/file/ (for example, http://test.com/file/1209i0329/, http://test.com/file/120dnkj329/, etc.). Is there a way to do this using simple_html_dom?
Upvotes: 1
Views: 2481
I don't know where the pages you want to scrape are located (on the same domain or elsewhere), but you can loop over an array containing the URLs you want to process.
Consider this example:
include 'simplehtmldom/simple_html_dom.php';

// Most likely this process will take some time
$files = array();
$urls = array(
    'http://test.com/file/1209i0329/',
    'http://test.com/file/120dnkj329/',
    'http://en.wikipedia.org/wiki/',
);

foreach ($urls as $url) {
    $html = file_get_html($url);
    // file_get_html() returns false on failure, so skip pages that could not be loaded
    if (!$html) {
        continue;
    }
    // Find all article blocks
    foreach ($html->find('div.Content') as $file) {
        $item = array(); // start fresh so values from a previous block do not carry over
        $item['date'] = $file->find('id.article-date', 0)->plaintext;
        $item['location'] = $file->find('id.article-location', 0)->plaintext;
        $item['price'] = $file->find('div.article', 0)->plaintext;
        $files[] = $item;
    }
    $html->clear(); // free memory before loading the next page
}

print_r($files);
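If you do not want to type the URL list by hand, one option is to build it from links found on some page that lists the files. The sketch below only shows the filtering step. The helper name filter_file_urls and the sample $hrefs are my own assumptions for illustration, not part of simple_html_dom, and I am assuming such a listing page exists on the site.

```php
<?php
// Hypothetical helper (not part of simple_html_dom): keep only links that
// start with a given prefix, skip the prefix page itself, and drop duplicates.
function filter_file_urls(array $hrefs, $prefix)
{
    $urls = array();
    foreach ($hrefs as $href) {
        $href = trim($href);
        // strpos(...) === 0 means the link begins with the prefix
        if (strpos($href, $prefix) === 0 && $href !== $prefix) {
            $urls[$href] = true; // array keys de-duplicate the links
        }
    }
    return array_keys($urls);
}

// Assumed sample data: hrefs as they might be read from a listing page.
$hrefs = array(
    'http://test.com/file/1209i0329/',
    'http://test.com/about/',
    'http://test.com/file/120dnkj329/',
    'http://test.com/file/1209i0329/',
    'http://test.com/file/',
);

$urls = filter_file_urls($hrefs, 'http://test.com/file/');
print_r($urls);
```

On a real page, $hrefs could probably be filled with something like foreach ($html->find('a') as $a) { $hrefs[] = $a->href; } and the resulting $urls passed to the loop above. Relative links would need to be converted to absolute URLs first, which this sketch does not do.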
Upvotes: 3