Ben

Reputation: 16553

Python urlparse, correct or incorrect?

Python's urlparse function parses a URL into six components (scheme, netloc, path, params, query, and fragment).

I've found that parsing "example.com/path/file.ext" returns an empty netloc and the path "example.com/path/file.ext".

Shouldn't it be netloc = "example.com" and path = "/path/file.ext"?

Do we really need a "://" to determine whether or not a netloc exists?
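
For reference, here is a minimal reproduction of the behaviour described above. It is written against Python 3's urllib.parse, which is an assumption on my part (in Python 2 the same function lives in the urlparse module).

```python
from urllib.parse import urlparse

# No scheme and no leading "//": the whole string is treated as a path.
bare = urlparse("example.com/path/file.ext")
print(repr(bare.scheme), repr(bare.netloc), repr(bare.path))
# '' '' 'example.com/path/file.ext'

# With an explicit scheme, the host ends up in netloc.
full = urlparse("http://example.com/path/file.ext")
print(repr(full.scheme), repr(full.netloc), repr(full.path))
# 'http' 'example.com' '/path/file.ext'
```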

Python's ticket: http://bugs.python.org/issue8284

Upvotes: 4

Views: 1289

Answers (2)

Vinay Sajip

Reputation: 99415

Without the scheme://, there's no guarantee that example.com is a domain; you could have a directory called example.com. Similarly, given a URL such as 'omfgroflmao/path/file.ext', how would you know whether 'omfgroflmao' is a machine on the local network (i.e. a netloc) or a path component?
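
If the caller already knows the input is meant to start with a host, that decision can be made explicitly before parsing. The sketch below uses Python 3's urllib.parse; parse_assuming_host is a hypothetical helper name, not part of the standard library, and the default scheme of "http" is an assumption.

```python
from urllib.parse import urlparse

def parse_assuming_host(text, default_scheme="http"):
    """Hypothetical helper: treat a scheme-less string as starting with a host."""
    # A leading "//" marks what follows as the network location, so add it
    # when the input has neither a "scheme://" prefix nor a leading "//".
    if "://" not in text and not text.startswith("//"):
        text = "//" + text
    # The scheme argument is only used when the input carries no scheme.
    return urlparse(text, scheme=default_scheme)

guessed = parse_assuming_host("example.com/path/file.ext")
print(guessed.scheme, guessed.netloc, guessed.path)

local = parse_assuming_host("omfgroflmao/path/file.ext")
print(local.netloc, local.path)
```

Note that this simply forces one interpretation: 'omfgroflmao' is reported as a netloc whether or not such a machine exists, which is exactly the guess urlparse declines to make on its own. The check is also deliberately simple and does not handle inputs such as "mailto:" addresses.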

I can't see that the Python code is actually wrong, but perhaps the documentation needs to spell out explicitly the behaviour in such ambiguous circumstances (I haven't checked).

Upvotes: 7

Messa

Reputation: 25201

example.com/path/file.ext is not a URL. It's just a string. For example, if you put <a href="example.com/path/file.ext"> into an HTML page, it will not link to http://example.com/path/file.ext; it is resolved as a path relative to the current page. Not having to prepend http:// is just a shortcut provided by web browsers. You cannot even pass such a string to urllib2.urlopen() and similar functions.
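
Both points can be checked without touching the network. This is a rough sketch in Python 3 (urllib.parse and urllib.request rather than urllib2); the base URL http://site.example/dir/page.html is a made-up page used only for illustration.

```python
from urllib.parse import urljoin
from urllib.request import Request

# Hypothetical page containing the <a href="example.com/path/file.ext"> link.
base = "http://site.example/dir/page.html"

# The href is resolved relative to the page, not treated as a host name.
resolved = urljoin(base, "example.com/path/file.ext")
print(resolved)
# http://site.example/dir/example.com/path/file.ext

# Building a request from a scheme-less string fails before any I/O happens.
try:
    Request("example.com/path/file.ext")
    request_error = None
except ValueError as exc:
    request_error = exc
print(request_error)
```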

Upvotes: 2
