user970956

Reputation: 339

Download file using CURL from URL

I'm trying to download a file using PHP and cURL. If I open the link in a browser I get an xlsx file, but when I save the file with PHP it can't be opened. I found that the content PHP saves from the URL is a gzip file; if I save it as a zip file I can open it and it's fine. The problem is that I want the extracted file on the server to work with, and I can't extract the zip file because ZipArchive says it's not a valid zip file. This is the code I'm using:

$fp = fopen ('file.zip', 'w+');

// Here is the file we are downloading, replace spaces with %20
$ch = curl_init(str_replace(" ","%20","http://members.tsetmc.com/tsev2/excel/MarketWatchPlus.aspx?d=0"));
curl_setopt($ch, CURLOPT_TIMEOUT, 50);

//  write curl response to file
curl_setopt($ch, CURLOPT_FILE, $fp); 
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);

//   get curl response
$ex = curl_exec($ch);
var_dump($ex); 
curl_close($ch);
fclose($fp);

$file = "images";
$zip = new ZipArchive;
$path = realpath($file);
$res = $zip->open("file.zip");

if ($res === TRUE) {
    $extract = $zip->extractTo($path);
    var_dump($extract);
    if ($extract) {
        echo "WOOT! $file extracted to $path";
    } else {
        echo $zip->getStatusString();
        echo 'not extracted';
    }
    // close the archive whether or not extraction succeeded
    $zip->close();
} else {
    echo $zip->getStatusString();
    echo "Doh! I couldn't open $file";
}

So my question is: how can I get the Excel file from that URL onto my host?

I have tried many things and none of them work.

Thanks

Upvotes: 0

Views: 2042

Answers (2)

Ali Khalili

Reputation: 1674

You should tell cURL to accept gzip encoding (via CURLOPT_ENCODING) so that it decompresses the response for you. Then you don't need to extract anything, and the downloaded file can be opened directly:

$url = 'http://members.tsetmc.com/tsev2/excel/MarketWatchPlus.aspx?d=0';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
curl_setopt($ch, CURLOPT_ENCODING, 'gzip, deflate');

$result = curl_exec($ch);
if (curl_errno($ch)) {
    echo 'Error:' . curl_error($ch);
}
curl_close($ch);

file_put_contents('test.xlsx', $result);
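
If a gzip-compressed copy has already been saved to disk (as in the question), it may also be possible to decode it after the fact instead of downloading again. The following is only a minimal sketch under that assumption: `maybe_gunzip` is a hypothetical helper name, not part of cURL or of the code above, and it relies on the zlib extension for `gzdecode`.

```php
<?php
// Hypothetical helper (an illustration, not from the original answer):
// returns the payload unchanged unless it starts with the gzip magic
// bytes 0x1f 0x8b, in which case it is decompressed with gzdecode().
function maybe_gunzip(string $bytes): string
{
    if (strncmp($bytes, "\x1f\x8b", 2) !== 0) {
        return $bytes; // not gzip, leave as-is
    }

    // gzdecode() needs the zlib extension; it returns false on failure
    $decoded = @gzdecode($bytes);
    if ($decoded === false) {
        throw new RuntimeException('Payload has a gzip header but could not be decoded');
    }

    return $decoded;
}
```

With the file names used in this thread, that would look something like `file_put_contents('test.xlsx', maybe_gunzip(file_get_contents('file.zip')));`. Setting CURLOPT_ENCODING is still the simpler route, since cURL then does the decoding itself.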

Upvotes: 0

Professor Abronsius

Reputation: 33813

function curl( $url=NULL, $options=NULL, $headers=false ){
    $cacert='c:/wwwroot/cacert.pem';    #EDIT THIS TO SUIT
    $vbh = fopen('php://temp', 'w+');

    session_write_close();

    $curl=curl_init();
    if( parse_url( $url,PHP_URL_SCHEME )=='https' ){
        curl_setopt( $curl, CURLOPT_SSL_VERIFYPEER, true );
        curl_setopt( $curl, CURLOPT_SSL_VERIFYHOST, 2 );
        curl_setopt( $curl, CURLOPT_CAINFO, $cacert );
    }
    curl_setopt( $curl, CURLOPT_URL,trim( $url ) );
    curl_setopt( $curl, CURLOPT_AUTOREFERER, true );
    curl_setopt( $curl, CURLOPT_FOLLOWLOCATION, true );
    curl_setopt( $curl, CURLOPT_FAILONERROR, true );
    curl_setopt( $curl, CURLOPT_HEADER, false );
    curl_setopt( $curl, CURLINFO_HEADER_OUT, false );
    curl_setopt( $curl, CURLOPT_RETURNTRANSFER, true );
    curl_setopt( $curl, CURLOPT_BINARYTRANSFER, true );
    curl_setopt( $curl, CURLOPT_CONNECTTIMEOUT, 20 );
    curl_setopt( $curl, CURLOPT_TIMEOUT, 60 );
    curl_setopt( $curl, CURLOPT_USERAGENT, 'Mozilla/5.0' );
    curl_setopt( $curl, CURLOPT_MAXREDIRS, 10 );
    curl_setopt( $curl, CURLOPT_ENCODING, '' );
    curl_setopt( $curl, CURLOPT_VERBOSE, true );
    curl_setopt( $curl, CURLOPT_NOPROGRESS, true );
    curl_setopt( $curl, CURLOPT_STDERR, $vbh );

    if( isset( $options ) && is_array( $options ) ){
        foreach( $options as $param => $value ) curl_setopt( $curl, $param, $value );
    }
    if( $headers && is_array( $headers ) ){
        curl_setopt( $curl, CURLOPT_HTTPHEADER, $headers );
    }
    $res=(object)array(
        'response'  =>  curl_exec( $curl ),
        'info'      =>  (object)curl_getinfo( $curl ),
        'errors'    =>  curl_error( $curl )
    );
    rewind( $vbh );
    $res->verbose=stream_get_contents( $vbh );
    fclose( $vbh );
    curl_close( $curl );

    return $res;
}




/* Make the curl request and save the file */
$url='http://members.tsetmc.com/tsev2/excel/MarketWatchPlus.aspx?d=0';

$saveto='c:/temp/downloaded_excel_file.xlsx'; #EDIT TO SUIT

$fp=fopen( $saveto, 'w' );
$options=array( CURLOPT_FILE => $fp );
$res=curl( $url, $options );
fclose( $fp );

if( $res->info->http_code==200 ){
    echo "OK";
}

This happily saves the xlsx file, which can then be opened in Excel. The file saved this way is 145 KB, rather than 141 KB with the original code.
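
As a quick sanity check after saving, one could confirm that the file on disk really is an xlsx container rather than a still-compressed gzip stream. An xlsx file is a ZIP archive, so it should begin with the ZIP local file header signature `PK\x03\x04`. This is a rough sketch only; `looks_like_xlsx_container` is a hypothetical name introduced for illustration, and the check looks at the signature alone, not at the workbook contents.

```php
<?php
// Hypothetical helper (an illustration, not part of the answer's code):
// returns true when the file starts with the ZIP local file header
// signature "PK\x03\x04", which every xlsx file is expected to have.
function looks_like_xlsx_container(string $path): bool
{
    $fh = @fopen($path, 'rb');
    if ($fh === false) {
        return false; // missing or unreadable file
    }

    $magic = fread($fh, 4); // first four bytes of the file
    fclose($fh);

    return $magic === "PK\x03\x04";
}
```

Calling `looks_like_xlsx_container($saveto)` after the download should return true for a usable file; a file that is still gzip-compressed starts with the bytes `0x1f 0x8b` and would fail the check.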

Upvotes: 2
