Jenz

Reputation: 8369

Regex for youtube URL

I am using the following regex to validate YouTube video share URLs.

var valid = /^(http\:\/\/)?(youtube\.com|youtu\.be)+$/;
alert(valid.test(url));
return false;

I want the regex to support the following URL formats:

http://youtu.be/cCnrX1w5luM  
http://youtube/cCnrX1w5luM  
www.youtube.com/cCnrX1w5luM  
youtube/cCnrX1w5luM  
youtu.be/cCnrX1w5luM   

I have tried several regexes but have not found one that works for share links. Can anyone help me solve this?

Upvotes: 64

Views: 91477

Answers (12)

phuc77

Reputation: 6915

Here's a regex I use to match and capture the important bits of YouTube URLs with video codes:

^((?:https?:)?\/\/)?((?:www|m)\.)?((?:youtube(?:-nocookie)?\.com|youtu\.be))(\/(?:[\w\-]+\?v=|embed\/|live\/|v\/)?)([\w\-]+)(\S+)?$

It works with the following URLs:

https://www.youtube.com/watch?v=DFYRQ_zQ-gk&feature=featured
https://www.youtube.com/watch?v=DFYRQ_zQ-gk
http://www.youtube.com/watch?v=DFYRQ_zQ-gk
//www.youtube.com/watch?v=DFYRQ_zQ-gk
www.youtube.com/watch?v=DFYRQ_zQ-gk
https://youtube.com/watch?v=DFYRQ_zQ-gk
http://youtube.com/watch?v=DFYRQ_zQ-gk
//youtube.com/watch?v=DFYRQ_zQ-gk
youtube.com/watch?v=DFYRQ_zQ-gk

https://m.youtube.com/watch?v=DFYRQ_zQ-gk
http://m.youtube.com/watch?v=DFYRQ_zQ-gk
//m.youtube.com/watch?v=DFYRQ_zQ-gk
m.youtube.com/watch?v=DFYRQ_zQ-gk

https://www.youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
http://www.youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
//www.youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
www.youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US

https://www.youtube.com/embed/DFYRQ_zQ-gk?autoplay=1
https://www.youtube.com/embed/DFYRQ_zQ-gk
http://www.youtube.com/embed/DFYRQ_zQ-gk
//www.youtube.com/embed/DFYRQ_zQ-gk
www.youtube.com/embed/DFYRQ_zQ-gk
https://youtube.com/embed/DFYRQ_zQ-gk
http://youtube.com/embed/DFYRQ_zQ-gk
//youtube.com/embed/DFYRQ_zQ-gk
youtube.com/embed/DFYRQ_zQ-gk

https://www.youtube-nocookie.com/embed/DFYRQ_zQ-gk?autoplay=1
https://www.youtube-nocookie.com/embed/DFYRQ_zQ-gk
http://www.youtube-nocookie.com/embed/DFYRQ_zQ-gk
//www.youtube-nocookie.com/embed/DFYRQ_zQ-gk
www.youtube-nocookie.com/embed/DFYRQ_zQ-gk
https://youtube-nocookie.com/embed/DFYRQ_zQ-gk
http://youtube-nocookie.com/embed/DFYRQ_zQ-gk
//youtube-nocookie.com/embed/DFYRQ_zQ-gk
youtube-nocookie.com/embed/DFYRQ_zQ-gk

https://youtu.be/DFYRQ_zQ-gk?t=120
https://youtu.be/DFYRQ_zQ-gk
http://youtu.be/DFYRQ_zQ-gk
//youtu.be/DFYRQ_zQ-gk
youtu.be/DFYRQ_zQ-gk

https://www.youtube.com/HamdiKickProduction?v=DFYRQ_zQ-gk

https://www.youtube.com/live/sMbxjePPmkw?feature=share

The captured groups are:

  1. protocol
  2. subdomain
  3. domain
  4. path
  5. video code
  6. query string

https://regex101.com/r/vHEc61/1
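
As a rough illustration of how the six groups line up, here is a minimal JavaScript sketch. The function name `parseYouTubeUrl` and the property names are made up for this example, and the dot in youtu.be is escaped so that it only matches a literal dot.

```javascript
// Pattern from this answer, with the dot in "youtu.be" escaped.
const YOUTUBE_URL = /^((?:https?:)?\/\/)?((?:www|m)\.)?((?:youtube(?:-nocookie)?\.com|youtu\.be))(\/(?:[\w\-]+\?v=|embed\/|live\/|v\/)?)([\w\-]+)(\S+)?$/;

// Hypothetical helper: returns the six captured parts, or null when nothing matches.
// Optional groups that did not take part in the match come back as undefined.
function parseYouTubeUrl(url) {
  const m = YOUTUBE_URL.exec(url);
  if (m === null) return null;
  const [, protocol, subdomain, domain, path, videoCode, queryString] = m;
  return { protocol, subdomain, domain, path, videoCode, queryString };
}

console.log(parseYouTubeUrl("https://www.youtube.com/watch?v=DFYRQ_zQ-gk&feature=featured"));
console.log(parseYouTubeUrl("youtu.be/DFYRQ_zQ-gk"));
```

Note that for watch?v= links the "path" group includes the watch?v= part, so the "query string" group only holds whatever follows the video code.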

Upvotes: 117

Kaligula

Reputation: 39

This is what I use in my scripts:

^(?:(?:https?:)?\/\/)?(?:(?:(?:www|m(?:usic)?)\.)?youtu(?:\.be|be\.com)\/(?:shorts\/|live\/|v\/|e(?:mbed)?\/|watch(?:\/|\?(?:\S+=\S+&)*v=)|oembed\?url=https?%3A\/\/(?:www|m(?:usic)?)\.youtube\.com\/watch\?(?:\S+=\S+&)*v%3D|attribution_link\?(?:\S+=\S+&)*u=(?:\/|%2F)watch(?:\?|%3F)v(?:=|%3D))?|www\.youtube-nocookie\.com\/embed\/)([\w-]{11})[\?&#]?\S*$

It matches:

    various protocols (and lack of one):
https://www.youtube.com/watch?v=U9t-slLl30E
http://www.youtube.com/watch?v=U9t-slLl30E
//www.youtube.com/watch?v=U9t-slLl30E
www.youtube.com/watch?v=U9t-slLl30E

    and for each protocol
    various domains:
www.youtube.com/watch?v=U9t-slLl30E
m.youtube.com/watch?v=U9t-slLl30E
music.youtube.com/watch?v=OD3F7J2PeYU
youtube.com/watch?v=U9t-slLl30E
www.youtube-nocookie.com/embed/U9t-slLl30E
youtu.be/U9t-slLl30E

    and for each domain (except -nocookie)
    various paths:
youtube.com/watch?v=U9t-slLl30E
youtube.com/watch/U9t-slLl30E
youtube.com/v/U9t-slLl30E
youtube.com/embed/U9t-slLl30E
youtube.com/e/U9t-slLl30E
youtube.com/live/9UMxZofMNbA
youtube.com/shorts/gOcxEMJSksg
youtube.com/oembed?url=http%3A//www.youtube.com/watch?v%3DU9t-slLl30E&format=json
youtube.com/attribution_link?a=JdfC0C9V6ZI&u=%2Fwatch%3Fv%3DU9t-slLl30E%26feature%3Dshare
youtube.com/attribution_link?a=8g8kPrPIi-ecwIsS&u=/watch%3Fv%3DU9t-slLl30E%26feature%3Dem-uploademail

    and for each path
    various parameters:
youtube.com/watch?v=U9t-slLl30E
youtube.com/watch?v=U9t-slLl30E&feature=shared
youtube.com/watch?v=U9t-slLl30E&t=1m02s
youtube.com/watch?v=U9t-slLl30E&lc=UgyYsn3aIQWSA19Esi54AaABAg
youtube.com/watch?v=Lo2qQmj0_h4&list=PLmXxqSJJq-yVWpRFGImHYZBQTuBGLjG4t&index=5&pp=iAQB8AUB
    in various order:
youtube.com/watch?feature=shared&v=U9t-slLl30E

    but not these:
(wrong ID)
youtube.com/watch?v=U$t-slLl30E
(too short ID)
youtube.com/watch?v=U9t-slLl30&t=10
(wrong or deprecated paths)
youtube.com/GitHub?v=U9t-slLl30E
youtube.com/?v=U9t-slLl30E
youtube.com/?vi=U9t-slLl30E
youtube.com/?feature=player_embedded&v=U9t-slLl30E
youtube.com/watch?vi=U9t-slLl30E
youtube.com/vi/U9t-slLl30E
(www.youtube-nocookie.com/embed/ only!)
youtube-nocookie.com/embed/U9t-slLl30E
www.youtube-nocookie.com/watch?v=U9t-slLl30E
http://www.youtube-nocookie.com/v/U9t-slLl30E?version=3&hl=en_US&rel=0
(playlist)
youtube.com/playlist?list=PLmXxqSJJq-yVWpRFGImHYZBQTuBGLjG4t

Try it https://regex101.com/r/7upRfP/

It also captures the video ID. If you want, you can restrict the video ID further by using the pattern from Glenn's answer in place of ([\w-]{11}).
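
A minimal JavaScript sketch of pulling out the captured ID, assuming the pattern above is used unchanged as a regex literal. The names `YT_STRICT` and `getVideoId` are made up for this example.

```javascript
// The pattern from this answer, copied verbatim into a JavaScript regex literal.
const YT_STRICT = /^(?:(?:https?:)?\/\/)?(?:(?:(?:www|m(?:usic)?)\.)?youtu(?:\.be|be\.com)\/(?:shorts\/|live\/|v\/|e(?:mbed)?\/|watch(?:\/|\?(?:\S+=\S+&)*v=)|oembed\?url=https?%3A\/\/(?:www|m(?:usic)?)\.youtube\.com\/watch\?(?:\S+=\S+&)*v%3D|attribution_link\?(?:\S+=\S+&)*u=(?:\/|%2F)watch(?:\?|%3F)v(?:=|%3D))?|www\.youtube-nocookie\.com\/embed\/)([\w-]{11})[\?&#]?\S*$/;

// Hypothetical helper: returns the 11-character video ID, or null if the URL is rejected.
function getVideoId(url) {
  const m = YT_STRICT.exec(url);
  return m ? m[1] : null;
}

console.log(getVideoId("https://www.youtube.com/watch?v=U9t-slLl30E"));
console.log(getVideoId("youtube.com/watch?feature=shared&v=U9t-slLl30E"));
console.log(getVideoId("youtube.com/playlist?list=PLmXxqSJJq-yVWpRFGImHYZBQTuBGLjG4t"));
```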

I'll try to keep this updated on gist https://gist.github.com/Kaligula0/1ff5f4e2cf1f351daeca3450f71fdcb5.

Upvotes: 0

user3342816

Reputation: 1263

Modified from phuc77's answer by:

  • capturing only the token, using non-capturing groups for everything else
  • spreading the pattern over multiple lines with comments, via the x (PCRE_EXTENDED) flag: /x, or @x here
  • using @ as the delimiter, so that / can be used without escaping
  • not escaping - at the end of character lists,
    e.g. [\w-] not [\w\-]
Example at regex101 with an experimental inclusion of # Possible: oembed?url=...v=:

https://regex101.com/r/0pZCmF/1

$yttok_regex = <<<EOR
@^

# Possible: http://
#       https://
#       //
(?:(?:https?:)?//)?

# Possible: www.
#       m.
(?:(?:www|m)\.)?

# Possible: youtube.com
#       youtube-nocookie.com
#       youtu.be
(?:(?:youtube(?:-nocookie)?\.com|youtu\.be))?

# Possible: /[a-zA-Z0-9_-]+?v=
#       /embed/
#       /v/
(?:/(?:[\w-]+\?v=|embed/|v/)?)?

# TOKEN:    [a-zA-Z0-9_-]
([\w-]+)

# Possible:
#       Anything not space+
(?:\S+)?

# EOF pattern with x(PCRE_EXTENDED) flag:
$@x
EOR;

Optionally use:

# TOKEN:    [a-zA-Z0-9_-]
([\w-]{11})

To match only 11-char long tokens.
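
JavaScript regular expressions have no x flag, so the commented multi-line layout cannot be copied over directly. One rough equivalent, sketched here, is to join commented fragments and pass them to `new RegExp`. The names `parts` and `ytTokenRegex` are made up for this example; it uses the 11-character token variant and escapes the dot in youtu.be.

```javascript
// Each fragment mirrors one commented section of the PHP pattern.
// String.raw keeps backslashes intact, and the RegExp constructor
// does not need "/" to be escaped.
const parts = [
  String.raw`^`,
  String.raw`(?:(?:https?:)?//)?`,                        // http://, https://, or //
  String.raw`(?:(?:www|m)\.)?`,                           // www. or m.
  String.raw`(?:(?:youtube(?:-nocookie)?\.com|youtu\.be))?`, // domain (optional, as in the PHP version)
  String.raw`(?:/(?:[\w-]+\?v=|embed/|v/)?)?`,            // /<name>?v=, /embed/, /v/, or just /
  String.raw`([\w-]{11})`,                                // TOKEN: the only capturing group
  String.raw`(?:\S+)?`,                                   // anything that is not whitespace
  String.raw`$`,
];

const ytTokenRegex = new RegExp(parts.join(""));

const m = ytTokenRegex.exec("https://www.youtube.com/embed/DFYRQ_zQ-gk");
console.log(m ? m[1] : null);
```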

Upvotes: 0

Bernhard Barker

Reputation: 55589

  • You're missing www in your regex
  • The second \. should be optional if you want to match both youtu.be and youtube (but I didn't change this since just youtube isn't actually a valid domain - see note below)
  • + in your regex allows for one or more of (youtube\.com|youtu\.be), not one or more wild-cards.
    You need to use a . to indicate a wild-card, and + to indicate you want one or more of them.
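
A quick JavaScript check of that last point, using the regex from the question. The variable names and the `withWildcard` variant are only for illustration.

```javascript
// The regex from the question: "+" repeats the whole domain group.
const original = /^(http\:\/\/)?(youtube\.com|youtu\.be)+$/;

console.log(original.test("youtu.be"));                    // true: one repetition of the group
console.log(original.test("youtu.beyoutube.com"));         // true: two repetitions of the group
console.log(original.test("http://youtu.be/cCnrX1w5luM")); // false: nothing allows "/cCnrX1w5luM"

// Same regex with "\/.+" after the group: a slash, then one or more of any character.
const withWildcard = /^(http\:\/\/)?(youtube\.com|youtu\.be)\/.+$/;

console.log(withWildcard.test("http://youtu.be/cCnrX1w5luM")); // true
```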

Try:

^(https?\:\/\/)?(www\.youtube\.com|youtu\.be)\/.+$

If you want it to match URLs with or without the www., just make it optional:

^(https?\:\/\/)?((www\.)?youtube\.com|youtu\.be)\/.+$

Invalid alternatives:

If you want www.youtu.be/... to also match (at the time of writing, this doesn't appear to be a valid URL format), put the optional www. outside the brackets:

^(https?\:\/\/)?(www\.)?(youtube\.com|youtu\.be)\/.+$

youtube/cCnrX1w5luM (with or without http://) isn't a valid URL, but the question explicitly asks for the regex to support it. To include it, replace youtu\.be with youtu\.?be in any regex above.
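
Putting this together, a minimal JavaScript sketch that applies the youtu\.?be change to the optional-www regex and runs it against the five formats from the question. The function name `isYouTubeShareUrl` is made up for this example.

```javascript
// Optional-www regex from above, with youtu\.be replaced by youtu\.?be.
const SHARE_URL = /^(https?\:\/\/)?((www\.)?youtube\.com|youtu\.?be)\/.+$/;

// Hypothetical helper: true if the string looks like one of the supported formats.
function isYouTubeShareUrl(url) {
  return SHARE_URL.test(url);
}

const samples = [
  "http://youtu.be/cCnrX1w5luM",
  "http://youtube/cCnrX1w5luM",
  "www.youtube.com/cCnrX1w5luM",
  "youtube/cCnrX1w5luM",
  "youtu.be/cCnrX1w5luM",
];

for (const url of samples) {
  console.log(url, isYouTubeShareUrl(url));
}
```

This only checks the shape of the URL; it does not verify that whatever follows the slash is a real video ID.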

Upvotes: 59

https://regexr.com/62kgd

^((http|https)\:\/\/)?(www\.youtube\.com|youtu\.?be)\/((watch\?v=)?([a-zA-Z0-9]{11}))(&.*)*$

https://www.youtube.com/watch?v=YPz9zqakRbk

https://www.youtube.com/watch?v=YPz9zqakRbk&t=11

http://youtu.be/cCnrX1w5luM&y=12

http://youtu.be/cCnrX1w5luM

http://youtube/cCnrXswsluM

www.youtube.com/cCnrX1w5luM

youtube/cCnrX1w5luM

Upvotes: 1

zmanplex

Reputation: 61

I took one of the answers from here and added support for a few edge cases that I noticed in my dataset. This should work for pretty much any valid URL.

^(?:https?:)?(?:\/\/)?(?:youtu\.be\/|(?:www\.|m\.)?youtube\.com\/(?:watch|v|embed)(?:\.php)?(?:\?.*v=|\/))([a-zA-Z0-9\_-]{7,15})(?:[\?&][a-zA-Z0-9\_-]+=[a-zA-Z0-9\_-]+)*(?:[&\/\#].*)?$

Upvotes: 6

xeon927

Reputation: 313

I know I'm about two years late to the party, but I needed to write something up anyway, and this seems to fit every test case that I can throw at it. You should be able to reference the first capture group ($1) to get the ID. It matches http and https, www and non-www, youtube.com and youtu.be, and /watch? and /watch.php? on youtube.com (youtu.be does not use these), and it still matches when there are other variables in the URL string (?t= for time, ?list= for playlists, etc.).

(?:https?:\/\/)?(?:youtu\.be\/|(?:www\.|m\.)?youtube\.com\/(?:watch|v|embed)(?:\.php)?(?:\?.*v=|\/))([a-zA-Z0-9\_-]+)
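
In JavaScript, the first capture group is available as index 1 of the match result. A minimal sketch, with the hypothetical names `YT_ID` and `extractVideoId`:

```javascript
// The pattern from this answer as a JavaScript regex literal.
const YT_ID = /(?:https?:\/\/)?(?:youtu\.be\/|(?:www\.|m\.)?youtube\.com\/(?:watch|v|embed)(?:\.php)?(?:\?.*v=|\/))([a-zA-Z0-9\_-]+)/;

// Hypothetical helper: returns the first capture group (the ID), or null.
function extractVideoId(url) {
  const m = url.match(YT_ID);
  return m ? m[1] : null;
}

console.log(extractVideoId("http://youtu.be/cCnrX1w5luM"));
console.log(extractVideoId("https://www.youtube.com/watch?feature=shared&v=DFYRQ_zQ-gk"));
```

Because the pattern is not anchored with ^ and $, it also finds a URL embedded in a longer string. The greedy .* before v= means that, if v= appears more than once in the query string, the last occurrence is the one captured.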

Upvotes: 20

Akash Jain

Reputation: 1052

I tried this one and it works fine for me.

(?:http(?:s)?:\/\/)?(?:www\.)?(?:youtu\.be\/|youtube\.com\/(?:(?:watch)?\?(?:.*&)?v(?:i)?=|(?:embed|v|vi|user)\/))([^\?&\"'<> #]+)

You can check here https://regex101.com/r/Kvk0nB/1

Upvotes: 1

Joey Mason

Reputation: 707

The format of YouTube video URLs has changed. This regex works for the youtube.com/watch?v= and youtu.be formats:

^(http(s)??\:\/\/)?(www\.)?((youtube\.com\/watch\?v=)|(youtu\.be\/))([a-zA-Z0-9\-_])+

Upvotes: 12

yusuf

Reputation: 3646

Based on many of the other regexes, this is the best I have come up with:

((http(s)?:\/\/)?)(www\.)?((youtube\.com\/)|(youtu\.be\/))[\S]+

Test: http://regexr.com/3bga2

Upvotes: 5

Nafiul Islam

Reputation: 82460

Try this:

((http://)?)(www\.)?((youtube\.com/)|(youtu\.be)|(youtube)).+

http://regexr.com?36o7a

Upvotes: 3

rolandvarga

Reputation: 126

Check this pattern instead:

r'(?i)(http.//|https.//)*[A-Za-z0-9._%+-]+\.\w+'

Upvotes: -5
