learner

Reputation: 171

How to check if a website is gzip enabled or not?

I'm using curl to check whether a given website has gzip enabled. This is the code I'm using:

$ch = curl_init('website name');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);

curl_setopt($ch, CURLINFO_HEADER_OUT, true);
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
    'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Encoding: gzip, deflate',
    'Accept-Language: en-US,en;q=0.5',
    'Connection: keep-alive',
    'SomeBull: BeingIgnored',
    'User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:16.0) Gecko/20100101 Firefox/16.0'
  )
);
$response = curl_exec($ch);
$info = curl_getinfo($ch);
curl_close($ch);

print_r($info);

Every time, the result shows the gzip option, even for sites where gzip is not enabled, such as http://pramitscomicsdump.com. Can you advise what I'm doing wrong? I just need to check whether gzip is enabled using curl.

I learned to use this command to check:

curl -I -H 'Accept-Encoding: gzip,deflate' 'site name'

but I'm unable to run this command in PHP.

Upvotes: 1

Views: 958

Answers (1)

drew010

Reputation: 69937

Good effort with your code.

There are a few things to consider when checking for this:

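  1. Just because you ask for gzip doesn't mean you'll get it, so you need to check the response headers to see whether the response was actually gzip compressed. In your code, print_r($info) includes the request headers (because of CURLINFO_HEADER_OUT), so the gzip you see is the Accept-Encoding header you sent, not the server's response. (Side note: some pages on a site may use gzip and others might not, but there's a good chance the "home" page will.)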
  2. You might want to use CURLOPT_FOLLOWLOCATION in case you're redirected. If you are redirected, cURL will return multiple sets of headers so you'll want to check the headers from the final request.

Here's some code to get you started:

<?php

$ch = curl_init('http://example.com');

curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); // follow redirects
curl_setopt($ch, CURLOPT_HEADER, 1); // include headers in curl response
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
    'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Encoding: gzip, deflate', // request gzip
    'Accept-Language: en-US,en;q=0.5',
    'Connection: keep-alive',
    'SomeBull: BeingIgnored',
    'User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:16.0) Gecko/20100101 Firefox/16.0'
  )
);
$response = curl_exec($ch);

if ($response === false) {
    die('Error fetching page: ' . curl_error($ch));
}

$info = curl_getinfo($ch);

for ($i = 0; $i <= $info['redirect_count']; ++$i) {
    // split headers and body into separate vars; repeat once per redirect
    // so that $headers ends up holding the final response's headers
    list($headers, $response) = explode("\r\n\r\n", $response, 2);
}

curl_close($ch);

$headers = explode("\r\n", $headers); // split headers into one per line
$hasGzip = false;

foreach($headers as $header) { // loop over each header
    if (stripos($header, 'Content-Encoding:') === 0) { // look for a line starting with Content-Encoding
        if (stripos($header, 'gzip') !== false) { // see if it contains gzip (case-insensitive)
            $hasGzip = true;
        }
    }
}

var_dump($hasGzip);
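
If you only want the headers, as with the curl -I command from the question, a HEAD request via CURLOPT_NOBODY is one option. The following is a minimal sketch, not a definitive implementation: the function names fetchHeaders and lastResponseHasGzip are made up for illustration, the type declarations assume PHP 7 or later, and CURLOPT_FOLLOWLOCATION is added on top of what the original command does. Be aware that some servers answer HEAD requests differently from GET requests, so the result may not always match what a browser would receive.

```php
<?php

// Hypothetical helper: returns true if the LAST header block in $rawHeaders
// has a Content-Encoding header that mentions gzip.
function lastResponseHasGzip(string $rawHeaders): bool
{
    // When redirects are followed, cURL returns one header block per
    // response, separated by a blank line; keep only the final block.
    $blocks = explode("\r\n\r\n", trim($rawHeaders));
    $last = end($blocks);

    foreach (explode("\r\n", $last) as $line) {
        // Header names are case-insensitive, so compare with stripos().
        if (stripos($line, 'Content-Encoding:') === 0
            && stripos($line, 'gzip') !== false) {
            return true;
        }
    }

    return false;
}

// Hypothetical helper: sends a HEAD request, roughly like
// curl -I -L -H 'Accept-Encoding: gzip, deflate' <url>
function fetchHeaders(string $url): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_NOBODY, true);         // HEAD request, no body
    curl_setopt($ch, CURLOPT_HEADER, true);         // include headers in output
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return instead of printing
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects
    curl_setopt($ch, CURLOPT_HTTPHEADER, array('Accept-Encoding: gzip, deflate'));

    $raw = curl_exec($ch);
    if ($raw === false) {
        $error = curl_error($ch);
        curl_close($ch);
        throw new RuntimeException('Error fetching headers: ' . $error);
    }

    curl_close($ch);
    return $raw;
}
```

Usage would look something like this:

var_dump(lastResponseHasGzip(fetchHeaders('http://example.com')));

Keeping the header parsing separate from the request makes it possible to check the parsing logic against a canned header string without touching the network.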

Upvotes: 1
