Nik

Reputation: 1528

ValueError: unknown url type in urllib2, though the url is fine if opened in a browser

I am trying to download a URL using urllib2 in Python.

The code is the following:

import urllib2
req = urllib2.Request('www.tattoo-cover.co.uk')
req.add_header('User-agent','Mozilla/5.0')
result = urllib2.urlopen(req)

It raises a ValueError and the program crashes for the URL in the example. When I open the URL in a browser, it works fine.

Any ideas how to handle the problem?

UPDATE:

Thanks to Ben James and sth, the problem has been identified: the URL needs the 'http://' prefix.

Now the question is refined: is it possible to handle such cases automatically with some built-in function, or do I have to do error handling with subsequent string concatenation?

Upvotes: 27

Views: 54269

Answers (4)

You can use the urlparse function from urllib.parse (Python 3) to check for the presence of an addressing scheme (http, https, ftp) and prepend one if it is missing:

In [1]: from urllib.parse import urlparse
    ..: 
    ..: url = 'www.myurl.com'
    ..: if not urlparse(url).scheme:
    ..:     url = 'http://' + url
    ..: 
    ..: url
Out[1]: 'http://www.myurl.com'
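If you need this in more than one place, the same check can be wrapped in a small helper. This is only a sketch: the name ensure_scheme and the default_scheme parameter are made up for illustration, and it assumes that a missing scheme should default to http, the way a browser does.

```python
from urllib.parse import urlparse


def ensure_scheme(url, default_scheme="http"):
    """Return url unchanged if it already has a scheme, else prepend one.

    ensure_scheme and default_scheme are hypothetical names, not part of urllib.
    """
    url = url.strip()
    if urlparse(url).scheme:
        # Already has a scheme such as http, https or ftp: leave it alone.
        return url
    # Strip leading slashes so protocol-relative input ("//host/path")
    # does not end up with four slashes after the scheme.
    return "{}://{}".format(default_scheme, url.lstrip("/"))


print(ensure_scheme("www.tattoo-cover.co.uk"))
print(ensure_scheme("https://example.com"))
```

Note that urlparse only looks at the text before the first colon, so input such as a bare host:port pair may be reported as having a scheme. Check for that separately if your URLs can look like that.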

Upvotes: 1

inetphantom

Reputation: 2617

I think you can use the urlparse function for that:

Python User Documentation

Upvotes: 0

Ben James

Reputation: 125119

When you enter a URL in a browser without the protocol, it defaults to HTTP. urllib2 won't make that assumption for you; you need to prefix it with http://.
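To see the difference without touching the network, here is a rough sketch using urllib.request, the Python 3 successor to urllib2. It assumes Python 3, where building a Request from a string with no scheme raises ValueError right away (in urllib2 the same error shows up at urlopen). The helper name request_accepts is made up for illustration.

```python
from urllib.request import Request


def request_accepts(url):
    """Return True if urllib.request can build a Request from url.

    request_accepts is a hypothetical helper; no network access happens here.
    """
    try:
        Request(url)
    except ValueError:
        # Raised as "unknown url type" when the string has no scheme.
        return False
    return True


print(request_accepts("www.tattoo-cover.co.uk"))
print(request_accepts("http://www.tattoo-cover.co.uk/"))
```

This only confirms that a scheme is present. It does not check that the host exists or that the server responds.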

Upvotes: 46

sth

Reputation: 229563

You have to use a complete URL including the protocol, not just specify a host name.

The correct URL would be http://www.tattoo-cover.co.uk/.

Upvotes: 7
