Reputation: 43
The following two lines of code hangs forever:
import urllib2
urllib2.urlopen('https://www.5giay.vn/', timeout=5)
This is with python2.7, and I have no http_proxy or any other env variables set. Any other website works fine. I can also wget the site without any issue. What could be the issue?
Upvotes: 4
Views: 2671
Reputation: 879143
If you run
import urllib2
url = 'https://www.5giay.vn/'
urllib2.urlopen(url, timeout=1.0)
wait for a few seconds, and then use C-c to interrupt the program, you'll see
File "/usr/lib/python2.7/ssl.py", line 260, in read
return self._sslobj.read(len)
KeyboardInterrupt
This shows that the program is hanging on self._sslobj.read(len)
.
SSL timeouts raise socket.timeout
.
You can control the delay before socket.timeout is raised by calling
socket.setdefaulttimeout(1.0)
.
For example,
import urllib2
import socket
socket.setdefaulttimeout(1.0)
url = 'https://www.5giay.vn/'
try:
urllib2.urlopen(url, timeout=1.0)
except IOError as err:
print('timeout')
% time script.py
timeout
real 0m3.629s
user 0m0.020s
sys 0m0.024s
Note that the requests module succeeds here although urllib2
did not:
import requests
r = requests.get('https://www.5giay.vn/')
How to enforce a timeout on the entire function call:
socket.setdefaulttimeout
only affects how long Python waits before an exception is raised if the server has not issued a response.
Neither it nor urlopen(..., timeout=...)
enforce a time limit on the entire function call.
To do that, you could use eventlets, as shown here.
If you don't want to install eventlets
, you could use multiprocessing
from the standard library; though this solution will not scale as well as an asynchronous solution such as the one eventlets
provides.
import urllib2
import socket
import multiprocessing as mp
def timeout(t, cmd, *args, **kwds):
pool = mp.Pool(processes=1)
result = pool.apply_async(cmd, args=args, kwds=kwds)
try:
retval = result.get(timeout=t)
except mp.TimeoutError as err:
pool.terminate()
pool.join()
raise
else:
return retval
def open(url):
response = urllib2.urlopen(url)
print(response)
url = 'https://www.5giay.vn/'
try:
timeout(5, open, url)
except mp.TimeoutError as err:
print('timeout')
Running this will either succeed or timeout in about 5 seconds of wall clock time.
Upvotes: 5