nickbusted
Reputation: 1109

Python - Django URLField

I have a Django project which contains models with a URLField. For the same project, I am writing a non-Django Python script and would like to normalise URLs.

url1 = '//habrastorage.org/files/fa9/f33/091/fa9f330913c0462c8f576393f4135ec6.jpg'
url2 = 'http://habrastorage.org/files/fa9/f33/091/fa9f330913c0462c8f576393f4135ec6.jpg'
url3 = 'www.habrastorage.org/files/fa9/f33/091/fa9f330913c0462c8f576393f4135ec6.jpg'

How can I normalise the URLs? Is it possible to cast a URL to an instance of Django's URLField? Ideally, all URLs would end up in the same format as url2.

Thanks!

Upvotes: 0

Views: 715

Answers (2)

Felix D.

Reputation: 2220

I'm not sure what you want to achieve, but to normalize your URLs so they look like the second one, you could use a regular expression and do a substitution with Python's re module.

formatted_url = re.sub(r'^((http\:|)//|www\.)?(?P<url>.*)', r'http://\g<url>', your_url)

That takes any URL of the form //blabla.com, www.blabla.com or http://blabla.com and returns http://blabla.com.

Here's an example of how it could be used:

import re

def getNormalized(url):
    """Returns the normalized version of a url"""
    # Strip a leading "http://", "//" or "www." and prepend "http://"
    return re.sub(r'^((http\:|)//|www\.)?(?P<url>.*)',
                  r'http://\g<url>', url)

url1 = '//habrastorage.org/files/fa9/f33/091/fa9f330913c0462c8f576393f4135ec6.jpg'
url2 = 'http://habrastorage.org/files/fa9/f33/091/fa9f330913c0462c8f576393f4135ec6.jpg'
url3 = 'www.habrastorage.org/files/fa9/f33/091/fa9f330913c0462c8f576393f4135ec6.jpg'

formatted_url1 = getNormalized(url1)
formatted_url2 = getNormalized(url2)
formatted_url3 = getNormalized(url3)

print(formatted_url1)
# http://habrastorage.org/files/fa9/f33/091/fa9f330913c0462c8f576393f4135ec6.jpg
print(formatted_url2)
# http://habrastorage.org/files/fa9/f33/091/fa9f330913c0462c8f576393f4135ec6.jpg
print(formatted_url3)
# http://habrastorage.org/files/fa9/f33/091/fa9f330913c0462c8f576393f4135ec6.jpg
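
One caveat worth noting: the pattern above does not recognise an https:// prefix, so such a URL would just get http:// prepended, and www. is only stripped when no scheme is present. If that matters for your data, a variant along these lines might help. This is only a sketch; the name get_normalized_keep_https and the choice to always strip www. are my own assumptions, not something the question asked for.

```python
import re

# Optional "http:" or "https:", optional "//", optional "www." at the start.
_PREFIX = re.compile(r'^(?:(?P<scheme>https?):)?(?://)?(?:www\.)?')


def get_normalized_keep_https(url):
    """Hypothetical variant: keep an existing https scheme, default to http."""
    match = _PREFIX.match(url)  # always matches, every part is optional
    scheme = match.group('scheme') or 'http'
    # Re-attach the chosen scheme to whatever follows the stripped prefix.
    return '{}://{}'.format(scheme, url[match.end():])


print(get_normalized_keep_https('//habrastorage.org/files/a.jpg'))
print(get_normalized_keep_https('https://www.habrastorage.org/files/a.jpg'))
```

For the three URLs in the question this should give the same result as getNormalized.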

Upvotes: 1

codingjoe

Reputation: 1257

If you want to know how Django does it, check the source code. At the link below you will find the to_python function that formats the returned string.

https://github.com/django/django/blob/master/django/forms/fields.py#L705-L738

It uses urlparse, or rather Django's own copy of it, for Python 2 and 3 support.
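
In a plain Python 3 script you can get something similar from the standard library's urllib.parse without importing Django. The following is a rough sketch of that idea, not Django's actual code: normalise_url, default_scheme and strip_www are hypothetical names, and dropping a leading www. is an assumption made only so that url3 from the question ends up looking like url2. It is meant for inputs that contain a host, like the ones in the question.

```python
from urllib.parse import urlsplit, urlunsplit


def normalise_url(url, default_scheme="http", strip_www=True):
    """Return url with an explicit scheme and host (hypothetical helper)."""
    scheme, netloc, path, query, fragment = urlsplit(url.strip())
    if not scheme:
        # No scheme given, e.g. "//host/path" or "www.host/path".
        scheme = default_scheme
    if not netloc:
        # "www.host/path" is parsed as a bare path, so split the host off it.
        netloc, _, rest = path.partition("/")
        path = "/" + rest if rest else ""
    if strip_www and netloc.startswith("www."):
        # Assumption: the caller wants "www." removed, as in url2.
        netloc = netloc[len("www."):]
    return urlunsplit((scheme, netloc, path, query, fragment))


for url in (
    "//habrastorage.org/files/a.jpg",
    "http://habrastorage.org/files/a.jpg",
    "www.habrastorage.org/files/a.jpg",
):
    print(normalise_url(url))
```

Unlike a fixed prefix substitution, this keeps an existing scheme such as https as it is, and leaves the query string and fragment untouched.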

Upvotes: 0
