Reputation: 22973
I have a list of URLs.
I am using the following to retrieve their contents:
import urllib2

for url in url_list:
    req = urllib2.Request(url)
    resp = urllib2.urlopen(req, timeout=5)
    resp_page = resp.read()
    print resp_page
When there is a timeout, the program just crashes. I just want to skip to the next URL when there is a socket.timeout: timed out error. How do I do this?
Thanks
Upvotes: 2
Views: 6092
Reputation: 3125
Although there already is an answer, I'd like to point out that urllib2 might not be solely responsible for this behavior.
As pointed out here (and as the problem description also suggests), the exception may belong to the socket library.
In that case, just add another except clause:
import socket
import urllib2

try:
    resp = urllib2.urlopen(req, timeout=5)
except urllib2.URLError:
    print "Bad URL or timeout"
except socket.timeout:
    print "socket timeout"
Upvotes: 7
Reputation: 818
Sounds like you just need to catch the timeout exception. I don't get the socket.timeout message that you do.
import urllib2

req = urllib2.Request("http://127.0.0.2")
try:
    resp = urllib2.urlopen(req, timeout=5)
except urllib2.URLError:
    print "Timeout!"
Obviously, you need a URL that will actually time out (127.0.0.2 may not on your box).
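If you want to provoke the exception for testing, one option is an aggressively small timeout against a real host. A minimal sketch follows; the URL and timeout value are only placeholders, and which exception you see depends on where the timeout occurs:
import socket
import urllib2

req = urllib2.Request("http://example.com")  # placeholder URL
try:
    # deliberately tiny timeout to force a failure for testing
    resp = urllib2.urlopen(req, timeout=0.001)
except urllib2.URLError:
    print "URLError (bad URL or the connection timed out)"
except socket.timeout:
    print "socket.timeout (timed out while reading the response)"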
Upvotes: 1
Reputation: 176740
I'm going to go ahead and assume that by "crashes" you mean "raises a URLError", as described by the urllib2.urlopen
docs. See the Errors and Exceptions section of the Python Tutorial.
for url in url_list:
    req = urllib2.Request(url)
    try:
        resp = urllib2.urlopen(req, timeout=5)
    except urllib2.URLError:
        print "Bad URL or timeout"
        continue  # skips to the next iteration of the loop
    resp_page = resp.read()
    print resp_page
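If you need to tell a timeout apart from other failures inside that except clause, the wrapped error is usually exposed as e.reason. A minimal sketch of that check (note that the HTTPError subclass may not set .reason, hence the getattr):
import socket
import urllib2

try:
    resp = urllib2.urlopen(req, timeout=5)
except urllib2.URLError as e:
    # When the connection itself times out, urllib2 wraps the underlying
    # socket.timeout, which is then available as e.reason.
    reason = getattr(e, "reason", None)  # HTTPError may not set .reason
    if isinstance(reason, socket.timeout):
        print "Timed out"
    else:
        print "Bad URL or HTTP error:", reason or e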
Upvotes: 1