Reputation: 21
I am trying to build a program that takes a keyword from the user, searches Google for it, finds all the sites that have PDF files matching that keyword, and downloads them. I was able to get the HTML of the Google search results for the keyword, but the links in that HTML are of no use to me and I can't download PDF files from them.
<?php
if (isset($_POST['submit'])) {
    // Build the query: the user's keyword plus "pdf", URL-encoded (spaces become "+").
    $endpoint = urlencode($_POST['info'] . ' pdf');
    $page = file_get_contents('https://www.google.com.pk/search?dcr=0&source=hp&q='.$endpoint.'&oq='.$endpoint.'&gs_l=psy-ab.3..35i39k1l2j0j0i131k1j0l3j0i131k1j0l2.73519.74668.0.75122.9.7.0.0.0.0.424.424.4-1.1.0....0...1.1.64.psy-ab..8.1.422.0...0.U3V3CxpsqhA');
    $dom = new DOMDocument;
    @$dom->loadHTML($page); // "@" suppresses warnings caused by malformed HTML
    // Print the text and the href of every link on the results page.
    $links = $dom->getElementsByTagName('a');
    foreach ($links as $link) {
        echo $link->nodeValue;
        echo $link->getAttribute('href'), '<br>';
    }
}
?>
This is what I have so far to get the HTML of the Google search results. I am stuck here; please guide me on what I should do next.
Upvotes: 0
Views: 14699
Reputation: 3559
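I think you should request the file at the link you just crawled and send it to the browser with the correct headers:
<?php
// Tell the browser that a PDF follows and that it should be saved as a download.
header('Content-Type: application/pdf');
header('Content-Disposition: attachment; filename="downloaded.pdf"');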
Or use cURL.
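As a rough sketch of the cURL route, the helper below streams a remote file straight to disk. The function name downloadWithCurl, the 60-second timeout, and the choice to delete the file on failure are assumptions made for illustration, and it assumes the curl extension is installed:

```php
<?php
// Hypothetical helper (not from the original answer): download $url to
// $destPath with cURL. Returns true on success, false on any failure.
function downloadWithCurl(string $url, string $destPath): bool
{
    $fp = fopen($destPath, 'wb');
    if ($fp === false) {
        return false;
    }

    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FILE, $fp);            // write the body directly to the file
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow HTTP redirects
    curl_setopt($ch, CURLOPT_FAILONERROR, true);    // treat HTTP status >= 400 as failure
    curl_setopt($ch, CURLOPT_TIMEOUT, 60);          // assumed limit, adjust as needed

    $ok = curl_exec($ch);
    fclose($fp);

    if ($ok === false) {
        unlink($destPath); // do not leave a partial file behind
        return false;
    }
    return true;
}
```

A call might look like downloadWithCurl($pdfUrl, 'downloaded.pdf'), where $pdfUrl is one of the links you crawled.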
Note that header() must be called before any other output, so you may want to split your application flow into two or three steps.
See also this other answer: https://stackoverflow.com/a/20080402/3279175
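For the crawling step itself, here is a minimal sketch of how the hrefs collected in the question's loop could be reduced to direct PDF URLs. It assumes the results page wraps targets in redirect links of the form "/url?q=<target>&...", which may not match what Google currently returns; the function name extractPdfUrls is made up for this example:

```php
<?php
// Hypothetical helper: reduce a list of href values to unique, absolute
// URLs whose path ends in ".pdf".
// Assumption: redirect-style links look like "/url?q=<target>&...".
function extractPdfUrls(array $hrefs): array
{
    $pdfs = [];
    foreach ($hrefs as $href) {
        // Unwrap the redirect if present; otherwise use the href as-is.
        if (strpos($href, '/url?') === 0) {
            parse_str(substr($href, strlen('/url?')), $params);
            $href = is_string($params['q'] ?? null) ? $params['q'] : '';
        }

        // Keep only http(s) URLs whose path component ends in ".pdf".
        $path = parse_url($href, PHP_URL_PATH);
        if (is_string($path)
            && preg_match('/\.pdf$/i', $path)
            && preg_match('#^https?://#i', $href)) {
            $pdfs[$href] = true; // array keys de-duplicate the URLs
        }
    }
    return array_keys($pdfs);
}
```

In the question's code you would collect $link->getAttribute('href') into an array instead of echoing it, pass that array to this function, and then download each returned URL.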
Upvotes: 2
Reputation: 5203
Try using file_put_contents and fopen:
$url = 'http:// ... ';
file_put_contents('file.pdf', fopen($url, 'r'));
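The one-liner above does not check whether fopen() succeeded or whether the server really returned a PDF. A slightly more defensive sketch follows; the function name savePdf and the "%PDF-" signature check are additions for illustration, and opening remote URLs with fopen() assumes allow_url_fopen is enabled:

```php
<?php
// Hypothetical helper: save $url to $destPath and report whether the
// result looks like a PDF. Returns false on any failure.
function savePdf(string $url, string $destPath): bool
{
    $in = @fopen($url, 'rb'); // "@" hides the warning; failure is handled below
    if ($in === false) {
        return false;
    }

    // file_put_contents() accepts a stream and copies it to the target file.
    $bytes = file_put_contents($destPath, $in);
    fclose($in);
    if ($bytes === false) {
        return false;
    }

    // PDF files normally start with the signature "%PDF-".
    $head = file_get_contents($destPath, false, null, 0, 5);
    if ($head !== '%PDF-') {
        unlink($destPath); // discard HTML error pages and similar
        return false;
    }
    return true;
}
```

Checking the return value lets you skip links that point to an HTML page rather than an actual PDF.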
Upvotes: 1