sirvon

Reputation: 2625

Can't unshorten bit.ly URLs?

I'm using the code from this Stack Overflow post to unshorten URLs:

import httplib
import urlparse

def unshorten_url(url):
    parsed = urlparse.urlparse(url)
    h = httplib.HTTPConnection(parsed.netloc)
    resource = parsed.path
    if parsed.query != "":
        resource += "?" + parsed.query
    h.request('HEAD', resource )
    response = h.getresponse()
    if response.status // 100 == 3 and response.getheader('Location'):
        return unshorten_url(response.getheader('Location')) # changed to process chains of short urls
    else:
        return url

All shortened links get unshortened except for newly created bit.ly URLs.

I get this error:

>>> unshorten_url("bit.ly/1atTViN")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 7, in unshorten_url
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 955, in request
    self._send_request(method, url, body, headers)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 989, in _send_request
    self.endheaders(body)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 951, in endheaders
    self._send_output(message_body)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 811, in _send_output
    self.send(msg)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 773, in send
    self.connect()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 754, in connect
    self.timeout, self.source_address)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 571, in create_connection
    raise err
socket.error: [Errno 61] Connection refused

What gives?

Upvotes: 1

Views: 616

Answers (2)

Martijn Pieters

Reputation: 1122392

You forgot to include the URL scheme:

unshorten_url("http://bit.ly/1atTViN")

Note the http:// there; it is important. Without it, the URL is not parsed correctly:

>>> import urlparse
>>> urlparse.urlparse('bit.ly/1atTViN')
ParseResult(scheme='', netloc='', path='bit.ly/1atTViN', params='', query='', fragment='')
>>> urlparse.urlparse('http://bit.ly/1atTViN')
ParseResult(scheme='http', netloc='bit.ly', path='/1atTViN', params='', query='', fragment='')

Notice that the netloc attribute is empty when http:// is omitted, so HTTPConnection is given no host name. You end up trying to connect to your own machine instead, and since you are not running a web server there, the connection is refused.
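To guard against this, the input could be normalized before it is parsed. The following is only a minimal sketch, not part of the original code: `ensure_scheme` is a hypothetical helper name, the default of `http` is an assumption, and it is written for Python 3, where the `urlparse` module became `urllib.parse`.

```python
from urllib.parse import urlparse


def ensure_scheme(url, default_scheme="http"):
    """Return url unchanged if it has both a scheme and a host.

    Otherwise prepend default_scheme so that urlparse() can fill in netloc.
    (Hypothetical helper; the name and the default are assumptions.)
    """
    parsed = urlparse(url)
    if parsed.scheme and parsed.netloc:
        return url
    # Strip leading slashes so a scheme-less "//bit.ly/x" is handled too.
    return "{}://{}".format(default_scheme, url.lstrip("/"))


print(ensure_scheme("bit.ly/1atTViN"))         # http://bit.ly/1atTViN
print(ensure_scheme("http://bit.ly/1atTViN"))  # http://bit.ly/1atTViN
```

Calling such a helper at the top of unshorten_url would let both forms of the short link reach the same host, though passing a full URL in the first place is the simpler fix.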

Upvotes: 3

Karol Sikora

Reputation: 522

bit.ly is probably refusing connections from tools like httplib. You can try changing the User-Agent header by passing it in the request call:

h.request('HEAD', resource, headers={'User-Agent': 'Mozilla/5.0 (X11; U; Linux i686; pl-PL; rv:1.7.10) Gecko/20050717 Firefox/1.0.6'})

Upvotes: 0
