Reputation: 131
I have the following code, which is supposed to check if an entered url is valid:
#!/usr/bin/env python3
import sys
import urllib.parse
# ...
def checkValidURL(someURL):
try:
parsed_url = urllib.parse.urlparse(someURL)
isURL = True
except ValueError:
print("Invalid URL!")
sys.exit(0)
# ...
if __name__ == "__main__":
checkValidURL(someURL)
If an invalid URL is entered, for example: someURL="http://ijfjiör@@@a:43244434::"
it should raise a ValueError
as described here:
Characters in the netloc attribute that decompose under NFKC normalization (as used by the IDNA encoding) into any of /, ?, #, @, or : will raise a ValueError. If the URL is decomposed before parsing, no error will be raised.
However, no exception is raised and the URL seems to be valid.
Is there anything I am doing wrong or is there any other way to check the validity of an URL?
Upvotes: 3
Views: 3562
Reputation: 189357
Your URL doesn't decompose into a string which contains a prohibited character, so the quotation is not at all relevant here.
The language in the quote is strictly about disallowing the use of internationalized domain name encoding like http://xn--foo/
to produce something like http://?/
and since you are not doing that here, no ValueError
is generated or indeed to be expected.
(Sorry, not in a place where I can create a genuine working example.)
Upvotes: 3
Reputation: 954
I'm not sure if the python validators module would be a better fit for your use case?
$ python3
>>>
>>>
>>> import validators
>>>
>>> validators.url("http://ijfjiör@@@a:43244434::")
ValidationFailure(func=url, args={'value': 'http://ijfjiör@@@a:43244434::', 'public': False})
>>>
>>> validators.url("http://www.google.com:8080/")
True
>>>
>>> validators.url("http://www.我愛你.com:8080/你好")
True
Upvotes: 1