Captain Payalytic

Reputation: 1071

False EOF from feof() when reading a socket with fgets()

I have inherited a piece of code which uses the fetchURL() function below to grab data from a URL. I've just noticed that feof() often returns true before the full page of data has been retrieved. I have run some tests, and using cURL or file_get_contents() retrieves the full page every time.

The error is intermittent. Out of 9 calls, sometimes 7 will complete successfully and sometimes only 4. A particular 4 of the 9 (they are GET requests that differ only in the query string) always complete successfully. I have tried reversing the order of the requests and the same 4 query strings are still always successful, whilst the remainder sometimes work and sometimes don't.
So it "seems" that the data being returned may have something to do with the problem, but it's the intermittent nature that has got me foxed. The data returned in each case is always the same (as in, every time I make a call with a query string of ?SearchString=8502806 the page returned contains the same data), but sometimes the full page is delivered by fgets/feof and sometimes not.

Does anyone have a suggestion as to what may be causing this situation? Most other posts I have seen on this subject are about the opposite problem, where feof() does not return true.

function fetchURL( $url, $ret = 'body' ) {
    $url_parsed = parse_url($url);
    $host = $url_parsed["host"];
    $port = (isset($url_parsed["port"]))?$url_parsed["port"]:'';
    if ($port==0)
        $port = 80;
    $path = $url_parsed["path"];
    if ($url_parsed["query"] != "")
        $path .= "?".$url_parsed["query"];

    $out = "GET $path HTTP/1.0\r\nHost: $host\r\n\r\n";

    $fp = fsockopen($host, $port, $errno, $errstr, 30);

    fwrite($fp, $out);
    $body = false;
    $h = '';
    $b = '';
    while (!feof($fp)) {
        $s = fgets($fp, 1024);
        if ( $body )
            $b .= $s;
        else
            $h .= $s;
        if ( $s == "\r\n" )
            $body = true;
    }

    fclose($fp);

    return ($ret == 'body')?$b:(($ret == 'head')?$h:array($h, $b));
}

Upvotes: 0

Views: 2822

Answers (2)

Tom van der Woerdt

Reputation: 29985

I see quite a few things wrong with that code.

  • Don't ever use feof on sockets. It'll hang until the server closes the socket, which does not necessarily happen immediately after the page was received.
  • feof might return true (socket is closed) while PHP still has some data in its buffer.
  • Your code to distinguish header from body seems to rely on PHP doing its job properly, which is generally a bad idea. fgets doesn't necessarily read a whole line; it can also return just a single byte (\r, and then on the next call you might get the \n).
  • You're not properly encoding the path value
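
To illustrate the header/body point: one way to avoid depending on how fgets happens to chunk the data is to collect the whole response first and then split it at the first blank line. This is only a minimal sketch; splitHttpResponse() is a hypothetical helper name, not something from the original code.

```php
<?php
// Hypothetical helper: split a raw HTTP response into header and body
// at the first CRLF CRLF, regardless of how the bytes were read.
function splitHttpResponse(string $raw): array
{
    $pos = strpos($raw, "\r\n\r\n");
    if ($pos === false) {
        // No blank line seen: treat everything as header, leave body empty.
        return [$raw, ''];
    }
    // Skip the 4 separator bytes so the body starts at its first byte.
    return [substr($raw, 0, $pos), substr($raw, $pos + 4)];
}

// Usage: a later blank line inside the body is left untouched.
list($head, $body) = splitHttpResponse(
    "HTTP/1.0 200 OK\r\nContent-Type: text/plain\r\n\r\nline one\r\n\r\nline two"
);
```

Because the split runs on the complete string, it does not matter whether the reads returned whole lines or single bytes.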

Why don't you just convert your code to use cURL or file_get_contents?
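
A rough sketch of what the file_get_contents route could look like. fetchBody() is a hypothetical name and the 30-second default is an assumption carried over from the fsockopen() call in the question; it returns only the body, not the headers.

```php
<?php
// Hypothetical wrapper around file_get_contents(); the function name and
// the default timeout are assumptions for illustration.
function fetchBody(string $url, int $timeout = 30): string
{
    // The 'http' context options apply when $url is an http:// URL.
    $context = stream_context_create([
        'http' => ['method' => 'GET', 'timeout' => $timeout],
    ]);
    // Suppress the warning and turn a failure into an exception instead.
    $body = @file_get_contents($url, false, $context);
    if ($body === false) {
        throw new RuntimeException("Could not fetch $url");
    }
    return $body;
}
```

This needs allow_url_fopen to be enabled for remote URLs. A caller would wrap fetchBody($url) in try/catch rather than testing feof() at all.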

Upvotes: 2

grahamj42

Reputation: 2762

It sounds like a timeout problem to me. See stream_set_timeout() in the PHP manual.
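
If a read timing out is what ends the loop early, something along these lines would at least let you tell a timeout apart from a real end of stream. It is a sketch only: readUntilEofOrTimeout() is a hypothetical helper, and the timeout value is whatever suits your server.

```php
<?php
// Hypothetical helper: read a socket until EOF or until a read times out,
// and report which of the two happened.
function readUntilEofOrTimeout($fp, int $seconds): array
{
    stream_set_timeout($fp, $seconds);
    $data = '';
    $timedOut = false;
    while (!feof($fp)) {
        $chunk = fread($fp, 8192);
        if (is_string($chunk)) {
            $data .= $chunk;
        }
        // 'timed_out' is set when the last read gave up waiting for data.
        $meta = stream_get_meta_data($fp);
        if ($meta['timed_out']) {
            $timedOut = true;
            break;
        }
        // Nothing more to read (EOF or error): stop instead of spinning.
        if ($chunk === false || $chunk === '') {
            break;
        }
    }
    return ['data' => $data, 'timed_out' => $timedOut];
}
```

In fetchURL() this would be called on $fp after the fwrite(); if 'timed_out' comes back true, the page was cut short by the timeout rather than by the server closing the connection.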

Upvotes: -1
