Nathan

Reputation: 63

Using regular expressions in Django's views.py?

I have a form in my Django app where one field is called url. The user can enter a YouTube URL. On submit, I want to save only the video ID.

My views.py looks like this:

import re
def video_new(request):
    if request.user.is_authenticated():
        if request.method == "POST":
            form = VideoForm(request.POST)
            if form.is_valid():
                video = form.save(commit=False)
                fullURL = video.url
                youtubeId = re.sub(r'\shttps://www.youtube.com/watch?v=\s', '',fullURL)
                video.url = youtubeId
                video.created_by = request.user
                video.save()
            return redirect('videos:video_detail', video_id=video.pk)
        else:
            form = VideoForm()
    else:
        #if user isn't logged in
        return redirect('login')
    return render(request, 'videos/video_edit.html', {'form': form})

When I print youtubeId to the console, I still see the full URL.

So I guess I'm not using re.sub correctly. How do I use it correctly?

Upvotes: 1

Views: 993

Answers (2)

Moses Koledoye

Reputation: 78556

You don't need the leading and trailing \s in your pattern: \s matches a whitespace character, so the pattern never matches a URL that has no whitespace around it. Besides, YouTube URLs cannot be trivially parsed with one pattern, as there is also a short form for every URL: https://youtu.be/....
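
To see why the pattern in the question leaves the URL untouched, here is a minimal sketch. The escaped pattern is only an illustration for the long URL form, not a general YouTube parser; the variable names are made up for this example.

```python
import re

full_url = 'https://www.youtube.com/watch?v=dwyw7esd67'

# Pattern from the question: \s demands a whitespace character on both sides,
# and the unescaped '?' makes the preceding 'h' optional instead of matching
# a literal '?'. Nothing matches, so re.sub returns the input unchanged.
broken = re.sub(r'\shttps://www.youtube.com/watch?v=\s', '', full_url)

# Escaped pattern anchored at the start of the string (long form only).
fixed = re.sub(r'^https://www\.youtube\.com/watch\?v=', '', full_url)

print(broken)  # https://www.youtube.com/watch?v=dwyw7esd67
print(fixed)   # dwyw7esd67
```

This still breaks on extra query parameters (for example &list=...) and on youtu.be links, which is why parsing the URL is the more robust route.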

Better to use urllib.parse.urlparse for parsing the url:

from urllib.parse import urlparse, parse_qs

def parse_youtube_url(url_str):
    parsed_url = urlparse(url_str)
    if parsed_url.netloc == 'www.youtube.com':
        # Long form: the id is the value of the "v" query parameter
        youtube_id = parse_qs(parsed_url.query)['v'][0]
    elif parsed_url.netloc == 'youtu.be':
        # Short form: the id is the URL path without the leading slash
        youtube_id = parsed_url.path.lstrip('/')
    else:
        raise ValueError('Host is not youtube')
    return youtube_id

url = 'https://www.youtube.com/watch?v=dwyw7esd67'
print(parse_youtube_url(url))
# dwyw7esd67

In Python 2, use from urlparse import urlparse, parse_qs instead.

Upvotes: 1

binpy

Reputation: 4194

You can use this function to get the ID from complex YouTube URLs.

Source: https://gist.github.com/kmonsoor/2a1afba4ee127cce50a0

def get_yt_video_id(url):
    """Return the video ID extracted from the given YouTube URL.

    Examples of URLs:
      Valid:
        'http://youtu.be/_lOT2p_FCvA',
        'www.youtube.com/watch?v=_lOT2p_FCvA&feature=feedu',
        'http://www.youtube.com/embed/_lOT2p_FCvA',
        'http://www.youtube.com/v/_lOT2p_FCvA?version=3&hl=en_US',
        'https://www.youtube.com/watch?v=rTHlyTphWP0&index=6&list=PLjeDyYvG6-40qawYNR4juzvSOg-ezZ2a6',
        'youtube.com/watch?v=_lOT2p_FCvA',
        'https://www.youtube.com/watch?v=S6q41Rfltsk'

      Invalid:
        'youtu.be/watch?v=_lOT2p_FCvA',
    """

    try:
        # python 3
        from urllib.parse import urlparse, parse_qs
    except ImportError:
        # python 2
        from urlparse import urlparse, parse_qs

    if url.startswith(('youtu', 'www')):
        url = 'http://' + url

    query = urlparse(url)

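    # hostname is None when the URL has no network location
    hostname = query.hostname or ''

    if 'youtube' in hostname:
        if query.path == '/watch':
            return parse_qs(query.query)['v'][0]
        elif query.path.startswith(('/embed/', '/v/')):
            return query.path.split('/')[2]
    elif 'youtu.be' in hostname:
        return query.path[1:]

    # No known YouTube URL shape matched (previously this could return None)
    raise ValueError('Not a recognized YouTube URL: %r' % url)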

In your case:

youtubeId = get_yt_video_id(video.url)
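
Since video.url comes from user input, that call can raise on malformed or non-YouTube URLs. Below is a minimal sketch of a guard around it. Both safe_video_id and demo_extractor are hypothetical names invented for this example and are not part of the gist; demo_extractor is only a simplified stand-in so the snippet runs on its own.

```python
from urllib.parse import urlparse, parse_qs


def safe_video_id(url, extractor):
    """Hypothetical helper: return extractor(url), or None if extraction fails."""
    try:
        video_id = extractor(url)
    except (ValueError, KeyError, IndexError, TypeError):
        # Malformed or non-YouTube input: report failure instead of crashing.
        return None
    # Treat an empty or missing result as a failure too.
    return video_id or None


def demo_extractor(url):
    """Stand-in for get_yt_video_id (assumption): handles only /watch URLs."""
    parsed = urlparse(url)
    if parsed.hostname != 'www.youtube.com' or parsed.path != '/watch':
        raise ValueError('unsupported URL')
    return parse_qs(parsed.query)['v'][0]


print(safe_video_id('https://www.youtube.com/watch?v=S6q41Rfltsk', demo_extractor))
print(safe_video_id('https://example.com/watch?v=abc', demo_extractor))
print(safe_video_id('https://www.youtube.com/watch', demo_extractor))
```

In the view you would pass get_yt_video_id as the extractor and, when the result is None, re-render the form with an error rather than saving the video.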

Upvotes: 3
