Reputation: 12466
I don't want to download the whole web page; it takes time and a lot of memory.
How can I download only a portion of the page, and then parse that?
Suppose I only need the <div id="entryPageContent" class="cssBaseOne">...</div>
element. How can I do that?
Upvotes: 2
Views: 2776
Reputation: 360702
You can't request "only this piece of HTML" from a URL. HTTP supports partial downloads only as byte ranges and has no concept of the HTML/XML document tree.
So you'll have to download the entire page, load it into a DOM parser, and then extract only the portion(s) you need.
e.g.
$html = file_get_contents('http://example.com/somepage.html');

$dom = new DOMDocument();
@$dom->loadHTML($html); // suppress warnings from malformed real-world HTML

$div = $dom->getElementById('entryPageContent');
$content = $dom->saveHTML($div); // serialize just that node (PHP 5.3.6+)
Upvotes: 5
Reputation: 7389
Using this:
curl_setopt($ch, CURLOPT_RANGE, "0-10000");
will make cURL download only the first 10,001 bytes of the page (bytes 0 through 10000). Note that this only works if the server honors the Range header; many dynamically generated responses (CGI, PHP, ...) ignore it.
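For completeness, here is a minimal sketch of a ranged download with cURL. The URL is a placeholder, and whether the range is honored depends entirely on the server, so the sketch checks the status code before trusting the result:

```php
<?php
// Placeholder URL; substitute the page you actually want.
$url = 'http://example.com/somepage.html';

$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of echoing it
curl_setopt($ch, CURLOPT_RANGE, '0-10000');     // request bytes 0 through 10000

$partial = curl_exec($ch);
$status  = curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);

// 206 Partial Content means the server honored the range;
// 200 means it ignored the Range header and sent the full page anyway.
if ($status === 206) {
    // $partial holds only the requested slice; parse it here.
} elseif ($status === 200) {
    // Full page was returned; fall back to parsing the whole thing.
}
```

Even on a 206 response you still have to parse the fragment yourself, and the cut at byte 10000 can land in the middle of a tag, so this is only a bandwidth optimization, not a substitute for DOM extraction.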
Upvotes: 0