lovesh

Reputation: 5391

Downloading a page with curl without downloading its images, CSS, or JavaScript

Whenever I use curl (PHP) to download a page, it seems to download everything on the page, including images, CSS files, and JavaScript files. Sometimes I don't want to download these. Can I control which resources curl downloads? I have gone through the manual but haven't found an option for this. Please don't suggest getting the whole page and then using some regex magic, because that would still download the page and increase load time. Here is demo code in which I download a page from mozilla.com:

<?php
$url="http://www.mozilla.com/en-US/firefox/new/";
$userAgent="Mozilla/5.0 (Windows NT 5.1; rv:2.0) Gecko/20100101 Firefox/4.0";
//$accept="text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
$encoding="gzip, deflate";
// CURLOPT_HTTPHEADER expects a list of "Name: value" strings, not an associative array.
$header=array(
    "Accept-Language: en-us,en;q=0.5",
    "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7",
    "Connection: keep-alive",
    "Keep-Alive: 115",
);

$ch=curl_init();
curl_setopt($ch,CURLOPT_USERAGENT,$userAgent);
curl_setopt($ch,CURLOPT_URL,$url);
curl_setopt($ch,CURLOPT_ENCODING,$encoding);
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
curl_setopt($ch,CURLOPT_RETURNTRANSFER,1);
curl_setopt($ch,CURLOPT_FOLLOWLOCATION,1);
curl_setopt($ch,CURLOPT_AUTOREFERER,1);
$content=curl_exec($ch);
curl_close($ch);
echo $content;
?>

When I echo the content, it shows the images too. I saw in Firebug's network tab that the images and external JS files are being downloaded.

Upvotes: 0

Views: 2030

Answers (2)

ajreal

Reputation: 47311

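You can avoid the download by escaping the output, so the browser displays the HTML source instead of rendering it and requesting the images: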

echo htmlentities($content);

Upvotes: 1

Marc B

Reputation: 360572

PHP's curl only fetches what you tell it to. It doesn't parse the HTML to look for <script>, <link>, or <img> tags, and it doesn't fetch the resources they reference.

If curl is downloading those resources, then your code is telling it to, and it's up to you to decide what to fetch and what not to. The requests you see in Firebug's network tab come from the browser rendering the HTML you echo, not from curl.
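
To make the point above concrete, here is a minimal sketch, not a definitive implementation. The helper names fetchHtml() and listSubresources() are hypothetical and not part of curl or PHP, and the sketch assumes the curl and DOM extensions are enabled. fetchHtml() makes one request for the document (plus any redirects), and listSubresources() only reads attribute values out of the returned string, so nothing it lists is ever downloaded.

```php
<?php
// Hypothetical helper: fetch only the HTML document at $url.
// curl requests this one URL (following redirects); it never parses the markup.
function fetchHtml(string $url): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    $html = curl_exec($ch);
    // curl_exec() returns false on failure; normalise that to an empty string.
    return $html === false ? '' : $html;
}

// Hypothetical helper: list the sub-resource URLs the markup refers to.
// Nothing is downloaded here; the function only reads attributes from the string.
function listSubresources(string $html): array
{
    if ($html === '') {
        return [];
    }
    $doc = new DOMDocument();
    // Suppress warnings caused by real-world HTML that is not well-formed.
    @$doc->loadHTML($html);
    $urls = [];
    foreach (['img' => 'src', 'script' => 'src', 'link' => 'href'] as $tag => $attr) {
        foreach ($doc->getElementsByTagName($tag) as $node) {
            if ($node->hasAttribute($attr)) {
                $urls[] = $node->getAttribute($attr);
            }
        }
    }
    return $urls;
}
```

Calling echo htmlspecialchars(fetchHtml($url)); would show the page source without the browser requesting any of the listed resources, whereas echoing the raw string lets the browser render it and fetch them itself.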

Upvotes: 1
