user1985563
user1985563

Reputation: 197

httplib.InvalidURL: nonnumeric port:

i'm trying to do a script which check if many urls exists:

import httplib

with open('urls.txt') as urls:
    for url in urls:
        connection = httplib.HTTPConnection(url)
        connection.request("GET")
        response = connection.getresponse()
        if response.status == 200:
            print '[{}]: '.format(url), "Up!"

But I got this error:

Traceback (most recent call last):
  File "test.py", line 5, in <module>
    connection = httplib.HTTPConnection(url)
  File "/usr/lib/python2.7/httplib.py", line 693, in __init__
    self._set_hostport(host, port)
  File "/usr/lib/python2.7/httplib.py", line 721, in _set_hostport
    raise InvalidURL("nonnumeric port: '%s'" % host[i+1:])
httplib.InvalidURL: nonnumeric port: '//globo.com/galeria/amazonas/a.html

What's wrong?

Upvotes: 14

Views: 45878

Answers (3)

Friyank
Friyank

Reputation: 479

nonnumeric port:

Solution :

http.client.HTTPSConnection("api.cognitive.microsofttranslator.com")

Remove "https://" from Service URL or Endpoint and it will work.

https://appdotpy.wordpress.com/2020/07/04/errorsolved-nonnumeric-port/

Upvotes: 2

Atul Arvind
Atul Arvind

Reputation: 16733

This might be a simple solution, here

connection = httplib.HTTPConnection(url)

you are using the httpconnection so no need to give url like, http://OSMQuote.com but instead of that you need to give OSMQuote.com.

In short remove the http:// and https:// from your URL, because the httplib is considering : as a port number and the port number must be numeric,

Hope this helps!

Upvotes: 32

tom
tom

Reputation: 19153

httplib.HttpConnection takes the host and port of the remote URL in its constructor, and not the whole URL.

For your use case, it's easier to use urllib2.urlopen.

import urllib2

with open('urls.txt') as urls:
    for url in urls:
        try:
            r = urllib2.urlopen(url)
        except urllib2.URLError as e:
            r = e
        if r.code in (200, 401):
            print '[{}]: '.format(url), "Up!"
        elif r.code == 404:
            print '[{}]: '.format(url), "Not Found!" 

Upvotes: 9

Related Questions