fire

Reputation: 21531

Retrieving page with fsockopen adds numbers to returned string

This is very strange: on some pages it returns the HTML fine, but on others it adds numbers to the beginning and end of the returned string ($out).

function lookupPage($page, $return = true) {
    $fp = fsockopen("127.0.0.1", 48580, $errno, $errstr, 5);        
    if (!$fp) {
        return false;
    }
    else {
        $out = "";
        $headers = "GET /" . $page . " HTTP/1.1\r\n";
        $headers .= "Host: www.site.com\r\n";
        $headers .= "Connection: Close\r\n\r\n";
        fwrite($fp, $headers);
        stream_set_timeout($fp, 300);
        $info = stream_get_meta_data($fp);
        while (!feof($fp) && !$info['timed_out'] && ($line = stream_get_line($fp, 1024)) !== false) {
            $info = stream_get_meta_data($fp);
            if ($return) $out .= $line;
        }
        fclose($fp);
        if (!$info['timed_out']) {
            if ($return) {
                $out = substr($out, strpos($out, "\r\n\r\n") + 4);
                return $out;
            }
            else {
                return true;
            }
        }
        else {
            return false;
        }
    }
}

For example, the returned string looks like this:

3565
<html>
<head>
...
</html>
0

Upvotes: 1

Views: 383

Answers (2)

Jürgen Thelen

Reputation: 12737

My guess would be that the server responds with chunked data.

Have a look at RFC 2616, section 3.6 ("Transfer Codings"), including its introduction.
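One way to confirm the guess is to inspect the response headers before they are stripped. The following is a minimal sketch, not a definitive implementation; `isChunkedResponse` is a hypothetical helper name, and it assumes you pass it the raw response (headers plus body) as collected in `$out` before the `substr()` call.

```php
<?php
// Hypothetical helper: returns true if the raw HTTP response declares
// a chunked body via its Transfer-Encoding header.
function isChunkedResponse(string $rawResponse): bool
{
    // The header section ends at the first blank line (CRLF CRLF).
    $headerEnd = strpos($rawResponse, "\r\n\r\n");
    $headers = $headerEnd === false
        ? $rawResponse
        : substr($rawResponse, 0, $headerEnd);

    // Header names are case-insensitive; match one header line at a time.
    return preg_match('/^Transfer-Encoding:.*\bchunked\b/mi', $headers) === 1;
}
```

If this returns true for the pages that come back with the extra numbers, the chunked guess is very likely correct.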

Upvotes: 0

Gavin Ward

Reputation: 1022

This is called chunked transfer encoding.

It is part of the HTTP/1.1 protocol, and you're reading the response in an HTTP/1.0 way. The numbers are chunk sizes in hexadecimal: each one gives the length of the chunk that follows, and the final 0 marks the end of the body, so the client knows it has the complete response. You can parse those values and strip them out if you want.
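Stripping the size lines could look roughly like the sketch below. `decodeChunked` is a hypothetical helper, not part of PHP; it assumes the headers have already been removed and that the whole body is in memory. It ignores trailers after the last chunk and stops quietly on malformed input.

```php
<?php
// Hypothetical helper: decode an HTTP/1.1 chunked body into the plain payload.
function decodeChunked(string $body): string
{
    $decoded = '';
    $offset = 0;
    $length = strlen($body);

    while ($offset < $length) {
        // Each chunk starts with its size in hexadecimal, terminated by CRLF.
        $lineEnd = strpos($body, "\r\n", $offset);
        if ($lineEnd === false) {
            break; // truncated input: return what has been decoded so far
        }
        $sizeLine = substr($body, $offset, $lineEnd - $offset);

        // Drop optional chunk extensions (after ';') before parsing the size.
        $sizeHex = trim(explode(';', $sizeLine, 2)[0]);
        if ($sizeHex === '' || !ctype_xdigit($sizeHex)) {
            break; // not a valid chunk-size line
        }
        $size = (int) hexdec($sizeHex);
        if ($size === 0) {
            break; // a zero-sized chunk marks the end of the body
        }

        $decoded .= substr($body, $lineEnd + 2, $size);

        // Skip the chunk data and the CRLF that follows it.
        $offset = $lineEnd + 2 + $size + 2;
    }

    return $decoded;
}
```

In `lookupPage()` this would be applied to the result of the `substr()` call, and only when the response headers contain `Transfer-Encoding: chunked`.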

Alternatively, take a look at file_get_contents, which handles the HTTP details for you.
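A rough sketch of that approach follows. `lookupPageSimple` and its `$baseUrl` parameter are hypothetical names chosen for illustration. The host, port, and `Host` header are taken from the question. It assumes `allow_url_fopen` is enabled, and that PHP's HTTP stream wrapper decodes a chunked response by itself, which it does to my knowledge.

```php
<?php
// Hypothetical alternative to lookupPage(): let PHP's HTTP stream wrapper
// send the request and parse the response.
function lookupPageSimple(string $page, string $baseUrl = "http://127.0.0.1:48580/")
{
    $context = stream_context_create([
        'http' => [
            'method'  => 'GET',
            // Same Host header the question sends by hand.
            'header'  => "Host: www.site.com\r\nConnection: close\r\n",
            // Read timeout in seconds, mirroring stream_set_timeout() in the question.
            'timeout' => 300,
        ],
    ]);

    // @ suppresses the warning raised on failure; false is returned instead.
    return @file_get_contents($baseUrl . $page, false, $context);
}
```

The function returns the response body as a string, or false if the request fails, so callers should compare the result with `=== false`.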

Upvotes: 2
