yarek

Reputation: 12044

Do not download response body when 404 is found

I use file_get_contents to fetch remote pages. Many of the pages return a 404 error with a customized (and heavy) 404 page.

Is there a way to stop and not download the whole page when a 404 status is found?

(Maybe curl or wget can do that?)

Upvotes: 0

Views: 322

Answers (2)

JoSSte

Reputation: 3372

I would do the following:

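$pageUrl = "http://www.example.com/myfile/which/may/not.exist";
$headers = get_headers($pageUrl); // returns false on failure
// Check the status code before downloading. Match only the numeric code,
// because the protocol version and reason phrase vary between servers.
$status = 0;
if ($headers !== false && preg_match('#^HTTP/\S+\s+(\d{3})#', $headers[0], $m)) {
  $status = (int) $m[1];
}
if ($status === 200) {
  // OK - download
  $download = file_get_contents($pageUrl);
} elseif ($status === 404) {
  // Not found - show error
}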

You could also use strpos() to look for the status code in the first header line instead.

This is based on the PHP manual page for get_headers.

Sample output:

Array
(
    [0] => HTTP/1.1 200 OK
    [1] => Date: Sat, 29 May 2004 12:28:13 GMT
    [2] => Server: Apache/1.3.27 (Unix)  (Red-Hat/Linux)
    [3] => Last-Modified: Wed, 08 Jan 2003 23:11:55 GMT
    [4] => ETag: "3f80f-1b6-3e1cb03b"
    [5] => Accept-Ranges: bytes
    [6] => Content-Length: 438
    [7] => Connection: close
    [8] => Content-Type: text/html
)
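
When the URL redirects, the array returned by get_headers may contain more than one status line, so checking only index 0 can report the redirect rather than the final response. The following is a minimal sketch of one way to read the last status code from an array shaped like the sample above. The helper name finalStatusCode and the sample data are hypothetical and are not part of the PHP manual.

```php
<?php
// Hypothetical helper: return the last HTTP status code found in an array
// shaped like the get_headers() output, or null if there is no status line.
function finalStatusCode(array $headers): ?int
{
    $status = null;
    foreach ($headers as $line) {
        // Status lines look like "HTTP/1.1 200 OK"; ordinary headers do not match.
        if (is_string($line) && preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $line, $m)) {
            $status = (int) $m[1];
        }
    }
    return $status;
}

// Made-up sample: a redirect followed by a 404 on the target.
$sample = [
    'HTTP/1.1 301 Moved Permanently',
    'Location: http://www.example.com/new',
    'HTTP/1.1 404 Not Found',
    'Content-Type: text/html',
];
var_dump(finalStatusCode($sample)); // int(404)
```

Matching only the three-digit code avoids depending on the exact reason phrase or protocol version the server sends.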

Upvotes: 0

Quentin

Reputation: 943591

No, this isn't possible.

HTTP provides some scope for conditional requests (such as If-Modified-Since), but none that trigger on the status code.

The closest you could come would be to make a HEAD request and then, if you don't get an error code back, make a GET request afterwards. You'd probably lose more to having two requests for every good resource than you would gain in not getting the bodies of bad resources.
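
For illustration, the HEAD-then-GET approach described above could look roughly like this in PHP. This is only a sketch: the function name fetchUnlessMissing and the 10 second timeout are assumptions, and it relies on get_headers accepting a stream context (available since PHP 7.1). It makes two requests for every page that exists, which is the cost mentioned above.

```php
<?php
// Hypothetical helper: return the page body if a HEAD request reports 200,
// otherwise return null without requesting the body.
function fetchUnlessMissing(string $url): ?string
{
    // Send HEAD so that the server returns headers only.
    $context = stream_context_create([
        'http' => ['method' => 'HEAD', 'timeout' => 10], // timeout is an assumed value
    ]);
    $headers = @get_headers($url, false, $context);
    if ($headers === false) {
        return null; // request failed (DNS error, connection refused, ...)
    }

    // Keep the last status line, in case redirects were followed.
    $status = 0;
    foreach ($headers as $line) {
        if (preg_match('#^HTTP/\S+\s+(\d{3})#', $line, $m)) {
            $status = (int) $m[1];
        }
    }
    if ($status !== 200) {
        return null; // 404 or any other non-200 status: skip the download
    }

    // Second request: the actual GET for the body.
    $body = file_get_contents($url);
    return $body === false ? null : $body;
}
```

A call such as fetchUnlessMissing("http://www.example.com/myfile/which/may/not.exist") would return null for a missing page. Some servers answer HEAD differently from GET, so the result is a hint rather than a guarantee.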

Upvotes: 2
