Ryan

Reputation: 3582

Curl Timeout in PHP (Works fine in CLI)

I'm running two websites locally on my Windows machine (a.ryan and b.ryan). The issue described below does not occur in the live environment (running CentOS 7). A script in b.ryan makes a cURL request to a.ryan, and the verbose log shows:

* Rebuilt URL to: http://a.ryan/
* Hostname a.ryan was found in DNS cache
*   Trying 192.168.0.64...
* TCP_NODELAY set
* Connected to a.ryan (192.168.0.64) port 80 (#0)
> GET / HTTP/1.1
Host: a.ryan
User-Agent: Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; WOW64; Trident/6.0)
Accept: */*

* Operation timed out after 10000 milliseconds with 0 bytes received
* Curl_http_done: called premature == 1
* Closing connection 0

As you can see, the connection times out. I've tried longer durations here (with the same result), although in reality the response should be near-instant.

I'm currently using the following function:

function getHTML($url) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_IPRESOLVE, CURL_IPRESOLVE_V4);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10); 
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, FALSE);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
    curl_setopt($ch, CURLOPT_SSLVERSION, 3);
    curl_setopt($ch, CURLOPT_PROXY, '');
    curl_setopt($ch, CURLOPT_HTTPPROXYTUNNEL, false);
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; WOW64; Trident/6.0)');
    curl_setopt($ch, CURLOPT_VERBOSE, true);
    curl_setopt($ch, CURLOPT_STDERR, fopen('curl.txt', 'w+'));
    $tmp = curl_exec($ch);
    curl_close($ch);
    if ($tmp != false) {
        return $tmp;
    }
}

Admittedly, many of these options may not need to be present; they are the result of trying multiple solutions found online. Just to clarify, I get exactly the same output as posted above when I use:

function getHTML($url) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10); 
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_VERBOSE, true);
    curl_setopt($ch, CURLOPT_STDERR, fopen('curl.txt', 'w+'));
    $tmp = curl_exec($ch);
    curl_close($ch);
    if ($tmp != false) {
        return $tmp;
    }
}

Hopefully that gives an idea of the settings I've tried with the PHP Curl method to get around this issue.
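
Both versions of getHTML() above discard the reason curl_exec() failed and return nothing on failure. As a minimal sketch, assuming two hypothetical helpers that are not part of my original code (getHTMLWithDiagnostics() and summarizeResult()), the same request could return cURL's error number, message and timings so the browser and CLI runs can be compared side by side. It avoids scalar type hints so that it should also run on PHP 5.6:

```php
<?php
// Hypothetical helper (not in the original code): performs the same GET as
// getHTML(), but keeps the cURL error details instead of discarding them.
function getHTMLWithDiagnostics($url, $timeoutSeconds = 10) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeoutSeconds);
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds);

    $body = curl_exec($ch);

    // Collect everything before closing the handle.
    $result = array(
        'ok'           => $body !== false,
        'body'         => $body === false ? null : $body,
        'errno'        => curl_errno($ch),   // 28 means the operation timed out
        'error'        => curl_error($ch),
        'http_code'    => curl_getinfo($ch, CURLINFO_HTTP_CODE),
        'connect_time' => curl_getinfo($ch, CURLINFO_CONNECT_TIME),
        'total_time'   => curl_getinfo($ch, CURLINFO_TOTAL_TIME),
    );
    curl_close($ch);

    return $result;
}

// Hypothetical helper: turns the result array into a one-line summary.
function summarizeResult(array $result) {
    if ($result['ok']) {
        return sprintf('OK: HTTP %d in %.2fs', $result['http_code'], $result['total_time']);
    }
    return sprintf(
        'FAILED: cURL error %d (%s) after %.2fs',
        $result['errno'],
        $result['error'],
        $result['total_time']
    );
}

// Usage (performs a real request, so it is left commented out):
// echo summarizeResult(getHTMLWithDiagnostics('http://a.ryan/')), "\n";
```

A short connect_time combined with a timeout would match the log above, where the connection is established but no bytes are ever received.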

When I run curl on the command line, it works fine:

* Rebuilt URL to: a.ryan/
*   Trying 192.168.0.64...
* TCP_NODELAY set
* Connected to a.ryan (192.168.0.64) port 80 (#0)
> GET / HTTP/1.1
> Host: a.ryan
> User-Agent: curl/7.55.1
> Accept: */*
>
< HTTP/1.1 302 Moved Temporarily
< Server: nginx/1.12.0
< Date: Wed, 01 May 2019 11:34:12 GMT
< Content-Type: text/html; charset=UTF-8
< Transfer-Encoding: chunked
< Connection: keep-alive
< X-Powered-By: PHP/5.6.30
< Set-Cookie: PHPSESSID=9898j4cia9s888jn24gr4be8m5; path=/
< Expires: Thu, 19 Nov 1981 08:52:00 GMT
< Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
< Pragma: no-cache
< location: /home
<
* Connection #0 to host a.ryan left intact

I've also disabled all IPv6 configuration on my network interfaces on this machine, as I was originally under the impression that this issue was caused by IPv6 resolutions instead of IPv4, but it made no difference.

Here's a copy of my hosts file if it helps.

# Copyright (c) 1993-2009 Microsoft Corp.
#
# This is a sample HOSTS file used by Microsoft TCP/IP for Windows.
#
# This file contains the mappings of IP addresses to host names. Each
# entry should be kept on an individual line. The IP address should
# be placed in the first column followed by the corresponding host name.
# The IP address and the host name should be separated by at least one
# space.
#
# Additionally, comments (such as these) may be inserted on individual
# lines or following the machine name denoted by a '#' symbol.
#
# For example:
#
#102.54.94.97   rhino.acme.com  # source server
#38.25.63.10    x.acme.com  # x client host
# localhost name resolution is handled within DNS itself.
#127.0.0.1  localhost
#::1    localhost
127.0.0.1   localhost.localdomain localhost MyPCName
127.0.0.1   a.ryan
127.0.0.1   b.ryan

EDIT

Forgot to mention: if I run the script from the CLI, it also runs fine. So the problem is specific to running the script through the browser (using Winginx to serve the website).

Upvotes: 2

Views: 2167

Answers (1)

hanshenrik

Reputation: 21455

I suppose it's possible that the web server (or perhaps overzealous firewall heuristics configured to block malicious web vulnerability scanners) has been configured to block requests that are obviously lying about the User-Agent. The request that doesn't work claims to be Internet Explorer 10, and it's quite obviously not. A real Internet Explorer GET request looks like:

GET / HTTP/1.1
Accept: text/html, application/xhtml+xml, */*
Accept-Language: nb-NO
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko
Accept-Encoding: gzip, deflate
Host: 127.0.0.1:9999
Connection: Keep-Alive

That differs from your forged request in quite a few ways, while the request that actually works truthfully claims to be curl/7.55.1.

What happens if you change the User-Agent to

curl_setopt($ch, CURLOPT_USERAGENT, 'libcurl/'.(curl_version()['version']).' PHP/'.PHP_VERSION);

? or even just

curl_setopt($ch, CURLOPT_USERAGENT, 'curl/7.55.1');

?
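
To check this in one go, here is a rough sketch that sends the same request once per User-Agent and reports the outcome of each. The function names candidateUserAgents() and compareUserAgents() are made up for illustration, and this only tests the guess above; it doesn't prove the User-Agent is the cause:

```php
<?php
// Hypothetical helper: the three User-Agent strings discussed in this thread.
function candidateUserAgents($curlVersion, $phpVersion) {
    return array(
        'forged MSIE 10' => 'Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; WOW64; Trident/6.0)',
        'libcurl + PHP'  => 'libcurl/' . $curlVersion . ' PHP/' . $phpVersion,
        'CLI curl'       => 'curl/7.55.1',
    );
}

// Hypothetical helper: requests $url once per User-Agent and records either
// the HTTP status code or the cURL error for each attempt.
function compareUserAgents($url, array $userAgents, $timeoutSeconds = 10) {
    $outcomes = array();
    foreach ($userAgents as $label => $userAgent) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeoutSeconds);
        curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds);

        $body = curl_exec($ch);
        $outcomes[$label] = $body === false
            ? 'error ' . curl_errno($ch) . ': ' . curl_error($ch)
            : 'HTTP ' . curl_getinfo($ch, CURLINFO_HTTP_CODE);
        curl_close($ch);
    }
    return $outcomes;
}

// Usage (performs real requests, so it is left commented out):
// $agents = candidateUserAgents(curl_version()['version'], PHP_VERSION);
// print_r(compareUserAgents('http://a.ryan/', $agents));
```

If all three time out when run through the browser, the User-Agent is probably not the problem.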

Upvotes: 0
