xzsnrp

Reputation: 43

Python's urllib.request.urlopen with disrupted internet connection

I have a problem with Python's urllib and an interrupted internet connection: if the first call to urllib.request.urlopen happens while there is no active connection, it never succeeds afterwards, even once the connection is back. The following works fine:

 > python
 >>> import urllib.request
 >>> urllib.request.urlopen("http://www.google.com")
 <http.client.HTTPResponse object at 0x7f6f54681438>

 #Now disable internet connection:
 > sudo ip link set enp4s0 down

 >>> urllib.request.urlopen("http://www.google.com")
 Traceback (most recent call last):
   File "/usr/lib/python3.4/urllib/request.py", line 1189, in do_open
     h.request(req.get_method(), req.selector, req.data, headers)
   File "/usr/lib/python3.4/http/client.py", line 1090, in request
     self._send_request(method, url, body, headers)
   File "/usr/lib/python3.4/http/client.py", line 1128, in _send_request
     self.endheaders(body)
   File "/usr/lib/python3.4/http/client.py", line 1086, in endheaders
     self._send_output(message_body)
   File "/usr/lib/python3.4/http/client.py", line 924, in _send_output
     self.send(msg)
   File "/usr/lib/python3.4/http/client.py", line 859, in send
     self.connect()
   File "/usr/lib/python3.4/http/client.py", line 836, in connect
     self.timeout, self.source_address)
   File "/usr/lib/python3.4/socket.py", line 491, in create_connection
     for res in getaddrinfo(host, port, 0, SOCK_STREAM):
   File "/usr/lib/python3.4/socket.py", line 530, in getaddrinfo
     for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
 socket.gaierror: [Errno -2] Name or service not known

 During handling of the above exception, another exception occurred:

 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "/usr/lib/python3.4/urllib/request.py", line 153, in urlopen
     return opener.open(url, data, timeout)
   File "/usr/lib/python3.4/urllib/request.py", line 455, in open
     response = self._open(req, data)
   File "/usr/lib/python3.4/urllib/request.py", line 473, in _open
     '_open', req)
   File "/usr/lib/python3.4/urllib/request.py", line 433, in _call_chain
     result = func(*args)
   File "/usr/lib/python3.4/urllib/request.py", line 1215, in http_open
     return self.do_open(http.client.HTTPConnection, req)
   File "/usr/lib/python3.4/urllib/request.py", line 1192, in do_open
     raise URLError(err)
 urllib.error.URLError: <urlopen error [Errno -2] Name or service not known>

 #Reenable internet connection:
 > sudo ip link set enp4s0 up #and wait a bit

 >>> urllib.request.urlopen("http://www.google.com")
 <http.client.HTTPResponse object at 0x7f6f5468c898>

So far so good. Now the exact same sequence, but without calling urlopen before the connection goes down:

 > python
 >>> import urllib.request
 # do not call urlopen before internet is down...


 #Now disable internet connection:
 > sudo ip link set enp4s0 down

 >>> urllib.request.urlopen("http://www.google.com")
 [exactly the same error message as above]

 #Reenable internet connection:
 > sudo ip link set enp4s0 up #and wait a bit

 #Ensure internet connection is up
 > ip link show enp4s0 up
 2: enp4s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP [...] 


 >>> urllib.request.urlopen("http://www.google.com")
 [exactly the same error message as above]
 #What's the problem? The internet connection IS up

 #However:
 > host www.google.com
 www.google.com has address 173.194.69.104
 [...]
 >>> urllib.request.urlopen("http://173.194.69.104")
 <http.client.HTTPResponse object at 0x7f3116a72e48>

So I suppose it has something to do with DNS (caching)?

Finally some information about my system:

 > python --version
 Python 3.4.1
 > uname -a
 Linux charon 3.15.3-1-ARCH #1 SMP PREEMPT Tue Jul 1 07:32:45 CEST 2014 x86_64 GNU/Linux

Sorry about the weird formatting. I mixed 'normal' shell commands (prefixed with '>') and Python shell commands (prefixed with '>>>') to make the exact command sequence clear; they were obviously run in different terminals.

Upvotes: 4

Views: 2516

Answers (1)

Jonas Schäfer

Reputation: 20718

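You are running into a well-known glibc problem: glibc reads /etc/resolv.conf once per process, the first time the resolver is used, and (at least in the versions current at the time of writing) does not re-read it unless res_init is called again. If that first lookup happens while the connection is down, the process keeps the stale resolver configuration even after the network comes back. One can argue whether this is a misuse of glibc or whether glibc is doing something wrong here. res_init is not part of POSIX but an interface originating from BSD, so it is hard to get right in a platform-independent manner.

There seems to be no bug report against Python for this problem, so you might want to file one.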

As a workaround, you could use ctypes to make a call to res_init yourself, but I don’t know how to do this exactly off the top of my head.
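A minimal sketch of that ctypes workaround, assuming Linux with glibc, where res_init is exported from libc under the symbol name `__res_init`. The helper names `reload_resolver_config` and `urlopen_with_resolver_reload` are made up for illustration, and the single retry is an assumption rather than a tested recipe:

```python
import ctypes
import ctypes.util
import urllib.error
import urllib.request


def reload_resolver_config():
    """Ask glibc to re-read /etc/resolv.conf; return True on success."""
    # Assumption: glibc exports res_init as __res_init in libc.
    libc_name = ctypes.util.find_library("c") or "libc.so.6"
    try:
        libc = ctypes.CDLL(libc_name)
        res_init = getattr(libc, "__res_init")
    except (OSError, AttributeError):
        # Not glibc, or the symbol is not exported on this platform.
        return False
    res_init.restype = ctypes.c_int
    res_init.argtypes = []
    # res_init returns 0 on success and -1 on failure.
    return res_init() == 0


def urlopen_with_resolver_reload(url, timeout=10):
    """Call urlopen; on URLError, reload the resolver config and retry once."""
    try:
        return urllib.request.urlopen(url, timeout=timeout)
    except urllib.error.URLError:
        reload_resolver_config()
        return urllib.request.urlopen(url, timeout=timeout)
```

In the session from the question, calling `reload_resolver_config()` after the interface is back up and before the next `urlopen` should be enough, provided the stale resolver configuration really is the cause.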

Upvotes: 2
