Reputation: 8369
I am using the following regex to validate YouTube video share URLs.
var valid = /^(http\:\/\/)?(youtube\.com|youtu\.be)+$/;
alert(valid.test(url));
return false;
I want the regex to support the following URL formats:
http://youtu.be/cCnrX1w5luM
http://youtube/cCnrX1w5luM
www.youtube.com/cCnrX1w5luM
youtube/cCnrX1w5luM
youtu.be/cCnrX1w5luM
I have tried several regexes but have not found one that works for share links. Can anyone help me solve this?
Upvotes: 64
Views: 91477
Reputation: 6915
Here's a regex I use to match and capture the important bits of YouTube URLs with video codes:
^((?:https?:)?\/\/)?((?:www|m)\.)?((?:youtube(?:-nocookie)?\.com|youtu\.be))(\/(?:[\w\-]+\?v=|embed\/|live\/|v\/)?)([\w\-]+)(\S+)?$
It works with the following URLs:
https://www.youtube.com/watch?v=DFYRQ_zQ-gk&feature=featured
https://www.youtube.com/watch?v=DFYRQ_zQ-gk
http://www.youtube.com/watch?v=DFYRQ_zQ-gk
//www.youtube.com/watch?v=DFYRQ_zQ-gk
www.youtube.com/watch?v=DFYRQ_zQ-gk
https://youtube.com/watch?v=DFYRQ_zQ-gk
http://youtube.com/watch?v=DFYRQ_zQ-gk
//youtube.com/watch?v=DFYRQ_zQ-gk
youtube.com/watch?v=DFYRQ_zQ-gk
https://m.youtube.com/watch?v=DFYRQ_zQ-gk
http://m.youtube.com/watch?v=DFYRQ_zQ-gk
//m.youtube.com/watch?v=DFYRQ_zQ-gk
m.youtube.com/watch?v=DFYRQ_zQ-gk
https://www.youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
http://www.youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
//www.youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
www.youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
https://www.youtube.com/embed/DFYRQ_zQ-gk?autoplay=1
https://www.youtube.com/embed/DFYRQ_zQ-gk
http://www.youtube.com/embed/DFYRQ_zQ-gk
//www.youtube.com/embed/DFYRQ_zQ-gk
www.youtube.com/embed/DFYRQ_zQ-gk
https://youtube.com/embed/DFYRQ_zQ-gk
http://youtube.com/embed/DFYRQ_zQ-gk
//youtube.com/embed/DFYRQ_zQ-gk
youtube.com/embed/DFYRQ_zQ-gk
https://www.youtube-nocookie.com/embed/DFYRQ_zQ-gk?autoplay=1
https://www.youtube-nocookie.com/embed/DFYRQ_zQ-gk
http://www.youtube-nocookie.com/embed/DFYRQ_zQ-gk
//www.youtube-nocookie.com/embed/DFYRQ_zQ-gk
www.youtube-nocookie.com/embed/DFYRQ_zQ-gk
https://youtube-nocookie.com/embed/DFYRQ_zQ-gk
http://youtube-nocookie.com/embed/DFYRQ_zQ-gk
//youtube-nocookie.com/embed/DFYRQ_zQ-gk
youtube-nocookie.com/embed/DFYRQ_zQ-gk
https://youtu.be/DFYRQ_zQ-gk?t=120
https://youtu.be/DFYRQ_zQ-gk
http://youtu.be/DFYRQ_zQ-gk
//youtu.be/DFYRQ_zQ-gk
youtu.be/DFYRQ_zQ-gk
https://www.youtube.com/HamdiKickProduction?v=DFYRQ_zQ-gk
https://www.youtube.com/live/sMbxjePPmkw?feature=share
The captured groups are, in order: protocol, subdomain, domain, path, video code, query string.
https://regex101.com/r/vHEc61/1
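For anyone using this from JavaScript (as in the question), here is a minimal sketch of reading the captured groups. The helper name parseYouTubeUrl and the property names are my own, not part of the answer, and the dot in youtu\.be is escaped in this copy so it only matches a literal dot.

```javascript
// Regex from the answer above; "youtu\.be" uses an escaped dot in this sketch.
const YOUTUBE_URL = /^((?:https?:)?\/\/)?((?:www|m)\.)?((?:youtube(?:-nocookie)?\.com|youtu\.be))(\/(?:[\w\-]+\?v=|embed\/|live\/|v\/)?)([\w\-]+)(\S+)?$/;

// Hypothetical helper: returns the six captured parts, or null if the URL does not match.
// Optional groups that did not take part in the match come back as undefined.
function parseYouTubeUrl(url) {
  const m = YOUTUBE_URL.exec(url);
  if (!m) return null;
  const [, protocol, subdomain, domain, path, videoId, rest] = m;
  return { protocol, subdomain, domain, path, videoId, rest };
}

console.log(parseYouTubeUrl("https://www.youtube.com/watch?v=DFYRQ_zQ-gk&feature=featured"));
console.log(parseYouTubeUrl("youtu.be/DFYRQ_zQ-gk"));
```

Note that the video code group is not length-restricted, so something like youtube.com/playlist?list=... also matches, with "playlist" as the video code. Check the length yourself if that matters.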
Upvotes: 117
Reputation: 39
This is what I use in my scripts:
^(?:(?:https?:)?\/\/)?(?:(?:(?:www|m(?:usic)?)\.)?youtu(?:\.be|be\.com)\/(?:shorts\/|live\/|v\/|e(?:mbed)?\/|watch(?:\/|\?(?:\S+=\S+&)*v=)|oembed\?url=https?%3A\/\/(?:www|m(?:usic)?)\.youtube\.com\/watch\?(?:\S+=\S+&)*v%3D|attribution_link\?(?:\S+=\S+&)*u=(?:\/|%2F)watch(?:\?|%3F)v(?:=|%3D))?|www\.youtube-nocookie\.com\/embed\/)([\w-]{11})[\?&#]?\S*$
It matches:
various protocols (and lack of one):
https://www.youtube.com/watch?v=U9t-slLl30E
http://www.youtube.com/watch?v=U9t-slLl30E
//www.youtube.com/watch?v=U9t-slLl30E
www.youtube.com/watch?v=U9t-slLl30E
and for each protocol
various domains:
www.youtube.com/watch?v=U9t-slLl30E
m.youtube.com/watch?v=U9t-slLl30E
music.youtube.com/watch?v=OD3F7J2PeYU
youtube.com/watch?v=U9t-slLl30E
www.youtube-nocookie.com/embed/U9t-slLl30E
youtu.be/U9t-slLl30E
and for each domain (except -nocookie)
various paths:
youtube.com/watch?v=U9t-slLl30E
youtube.com/watch/U9t-slLl30E
youtube.com/v/U9t-slLl30E
youtube.com/embed/U9t-slLl30E
youtube.com/e/U9t-slLl30E
youtube.com/live/9UMxZofMNbA
youtube.com/shorts/gOcxEMJSksg
youtube.com/oembed?url=http%3A//www.youtube.com/watch?v%3DU9t-slLl30E&format=json
youtube.com/attribution_link?a=JdfC0C9V6ZI&u=%2Fwatch%3Fv%3DU9t-slLl30E%26feature%3Dshare
youtube.com/attribution_link?a=8g8kPrPIi-ecwIsS&u=/watch%3Fv%3DU9t-slLl30E%26feature%3Dem-uploademail
and for each path
various parameters:
youtube.com/watch?v=U9t-slLl30E
youtube.com/watch?v=U9t-slLl30E&feature=shared
youtube.com/watch?v=U9t-slLl30E&t=1m02s
youtube.com/watch?v=U9t-slLl30E&lc=UgyYsn3aIQWSA19Esi54AaABAg
youtube.com/watch?v=Lo2qQmj0_h4&list=PLmXxqSJJq-yVWpRFGImHYZBQTuBGLjG4t&index=5&pp=iAQB8AUB
in various order:
youtube.com/watch?feature=shared&v=U9t-slLl30E
but not these:
(wrong ID)
youtube.com/watch?v=U$t-slLl30E
(too short ID)
youtube.com/watch?v=U9t-slLl30&t=10
(wrong or deprecated paths)
youtube.com/GitHub?v=U9t-slLl30E
youtube.com/?v=U9t-slLl30E
youtube.com/?vi=U9t-slLl30E
youtube.com/?feature=player_embedded&v=U9t-slLl30E
youtube.com/watch?vi=U9t-slLl30E
youtube.com/vi/U9t-slLl30E
(www.youtube-nocookie.com/embed/ only!)
youtube-nocookie.com/embed/U9t-slLl30E
www.youtube-nocookie.com/watch?v=U9t-slLl30E
http://www.youtube-nocookie.com/v/U9t-slLl30E?version=3&hl=en_US&rel=0
(playlist)
youtube.com/playlist?list=PLmXxqSJJq-yVWpRFGImHYZBQTuBGLjG4t
Try it https://regex101.com/r/7upRfP/
It also captures the video ID. If you want, you can restrict the video ID further with the pattern from Glenn's answer instead of ([\w-]{11}).
I'll try to keep this updated on gist https://gist.github.com/Kaligula0/1ff5f4e2cf1f351daeca3450f71fdcb5.
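If you want to check the pattern locally rather than on regex101, something like the following JavaScript sketch should do. The helper name videoIdOf and the two arrays are mine; the URLs are taken from the lists above.

```javascript
// The regex from this answer, copied as a JavaScript literal.
const YT_URL = /^(?:(?:https?:)?\/\/)?(?:(?:(?:www|m(?:usic)?)\.)?youtu(?:\.be|be\.com)\/(?:shorts\/|live\/|v\/|e(?:mbed)?\/|watch(?:\/|\?(?:\S+=\S+&)*v=)|oembed\?url=https?%3A\/\/(?:www|m(?:usic)?)\.youtube\.com\/watch\?(?:\S+=\S+&)*v%3D|attribution_link\?(?:\S+=\S+&)*u=(?:\/|%2F)watch(?:\?|%3F)v(?:=|%3D))?|www\.youtube-nocookie\.com\/embed\/)([\w-]{11})[\?&#]?\S*$/;

// Hypothetical helper: the 11-character video ID, or null when the URL is rejected.
function videoIdOf(url) {
  const m = YT_URL.exec(url);
  return m ? m[1] : null;
}

const shouldMatch = [
  "https://www.youtube.com/watch?v=U9t-slLl30E",
  "music.youtube.com/watch?v=OD3F7J2PeYU",
  "youtu.be/U9t-slLl30E",
  "youtube.com/shorts/gOcxEMJSksg",
  "youtube.com/watch?feature=shared&v=U9t-slLl30E",
  "www.youtube-nocookie.com/embed/U9t-slLl30E",
];

const shouldNotMatch = [
  "youtube.com/watch?v=U$t-slLl30E",      // wrong ID
  "youtube.com/watch?v=U9t-slLl30&t=10",  // too short ID
  "youtube.com/playlist?list=PLmXxqSJJq-yVWpRFGImHYZBQTuBGLjG4t",
  "youtube-nocookie.com/embed/U9t-slLl30E",
];

for (const url of shouldMatch) console.log("match   ", url, "->", videoIdOf(url));
for (const url of shouldNotMatch) console.log("no match", url, "->", videoIdOf(url));
```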
Upvotes: 0
Reputation: 1263
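Modified from phuk's answer, with these changes:
Only the token is captured: non-capturing groups are used for everything but the token.
The x flag (PCRE_EXTENDED) is used, written here as @x, so the pattern can carry comments.
@ is used as the delimiter, so that / can be used without escaping.
- is placed at the end of character lists: [\w-], not [\w\-].
Example at regex101, with an experimental inclusion of "# Possible: oembed?url=...v=":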
https://regex101.com/r/0pZCmF/1
$yttok_regex = <<<EOR
@^
# Possible: http://
# https://
# //
(?:(?:https?:)?//)?
# Possible: www.
# m.
(?:(?:www|m)\.)?
# Possible: youtube.com
# youtube-nocookie.com
# youtu.be
(?:(?:youtube(?:-nocookie)?\.com|youtu\.be))?
# Possible: /[a-zA-Z0-9_-]+?v=
# /embed/
# /v/
(?:/(?:[\w-]+\?v=|embed/|v/)?)?
# TOKEN: [a-zA-Z0-9_-]
([\w-]+)
# Possible:
# Anything not space+
(?:\S+)?
# EOF pattern with x(PCRE_EXTENDED) flag:
$@x
EOR;
Optionally use:
# TOKEN: [a-zA-Z0-9_-]
([\w-]{11})
To match only 11-char long tokens.
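Since the question is about JavaScript, which has no x (extended) flag, here is a rough port. This is only a sketch: the commented layout is emulated by joining string parts, the names parts, ytTokenRegex and ytToken are mine, the dot in youtu\.be is escaped, and the 11-character variant of the token is used.

```javascript
// Each part mirrors one step of the PHP pattern above.
// Backslashes are doubled because these are string literals, not regex literals.
const parts = [
  "^",
  "(?:(?:https?:)?//)?",                              // http://, https:// or //
  "(?:(?:www|m)\\.)?",                                // www. or m.
  "(?:(?:youtube(?:-nocookie)?\\.com|youtu\\.be))?",  // youtube.com, youtube-nocookie.com, youtu.be
  "(?:/(?:[\\w-]+\\?v=|embed/|v/)?)?",                // /xxx?v=, /embed/ or /v/
  "([\\w-]{11})",                                     // TOKEN: exactly 11 of [a-zA-Z0-9_-]
  "(?:\\S+)?",                                        // anything that is not whitespace
  "$",
];
const ytTokenRegex = new RegExp(parts.join(""));

// Hypothetical helper: returns the token, or null when there is no match.
function ytToken(url) {
  const m = ytTokenRegex.exec(url);
  return m ? m[1] : null;
}

console.log(ytToken("https://www.youtube.com/watch?v=DFYRQ_zQ-gk"));
console.log(ytToken("youtu.be/DFYRQ_zQ-gk?t=120"));
```

Keep in mind that every part before the token is optional, so a bare string of 11 or more token characters also matches.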
Upvotes: 0
Reputation: 55589
www is missing from your regex.
The \. in youtu\.be should be optional if you want to match both youtu.be and youtube (but I didn't change this, since just youtube isn't actually a valid domain; see the note below).
The + in your regex allows for one or more of (youtube\.com|youtu\.be), not one or more wildcards. You need . to indicate a wildcard, and + to indicate that you want one or more of them.
Try:
^(https?\:\/\/)?(www\.youtube\.com|youtu\.be)\/.+$
If you want it to match URLs with or without the www., just make it optional:
^(https?\:\/\/)?((www\.)?youtube\.com|youtu\.be)\/.+$
Invalid alternatives:
If you want www.youtu.be/... to also match (at the time of writing, this doesn't appear to be a valid URL format), put the optional www. outside the brackets:
^(https?\:\/\/)?(www\.)?(youtube\.com|youtu\.be)\/.+$
youtube/cCnrX1w5luM (with or without http://) isn't a valid URL, but the question explicitly mentions that the regex should support that. To include this, replace youtu\.be with youtu\.?be in any regex above. Live demo.
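To see how the variants differ on the URLs from the question, a quick JavaScript sketch along these lines can help. The names patterns, urls and countMatches are just for illustration.

```javascript
// Patterns from this answer; bareYoutube applies the youtu\.?be replacement.
const patterns = {
  wwwRequired: /^(https?\:\/\/)?(www\.youtube\.com|youtu\.be)\/.+$/,
  wwwOptional: /^(https?\:\/\/)?((www\.)?youtube\.com|youtu\.be)\/.+$/,
  bareYoutube: /^(https?\:\/\/)?((www\.)?youtube\.com|youtu\.?be)\/.+$/,
};

// URL formats listed in the question.
const urls = [
  "http://youtu.be/cCnrX1w5luM",
  "http://youtube/cCnrX1w5luM",
  "www.youtube.com/cCnrX1w5luM",
  "youtube/cCnrX1w5luM",
  "youtu.be/cCnrX1w5luM",
];

// Hypothetical helper: how many of the question's URLs a pattern accepts.
function countMatches(re) {
  return urls.filter((u) => re.test(u)).length;
}

for (const [name, re] of Object.entries(patterns)) {
  console.log(name, "matches", countMatches(re), "of", urls.length);
}
```

Only the youtu\.?be variant accepts the two bare youtube/... forms; the other two reject them.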
Upvotes: 59
Reputation: 11
^((http|https)\:\/\/)?(www\.youtube\.com|youtu\.?be)\/((watch\?v=)?([a-zA-Z0-9]{11}))(&.*)*$
https://www.youtube.com/watch?v=YPz9zqakRbk
https://www.youtube.com/watch?v=YPz9zqakRbk&t=11
http://youtu.be/cCnrX1w5luM&y=12
http://youtube/cCnrXswsluM
youtube/cCnrX1w5luM
Upvotes: 1
Reputation: 61
I took one of the answers from here and added support for a few edge cases that I noticed in my dataset. This should work for pretty much any valid URL.
^(?:https?:)?(?:\/\/)?(?:youtu\.be\/|(?:www\.|m\.)?youtube\.com\/(?:watch|v|embed)(?:\.php)?(?:\?.*v=|\/))([a-zA-Z0-9\_-]{7,15})(?:[\?&][a-zA-Z0-9\_-]+=[a-zA-Z0-9\_-]+)*(?:[&\/\#].*)?$
Upvotes: 6
Reputation: 313
I know I'm about 2 years late to the party, but I needed to write something up anyway, and this seems to fit every test case that I can throw at it. You should be able to reference the first capture group ($1) to get the ID. It matches http, https, www and non-www, youtube.com, youtu.be, and /watch? and /watch.php? on youtube.com (youtu.be does not use these), and it still matches when there are other variables in the URL string (?t= for time, ?list= for playlists, etc.).
(?:https?:\/\/)?(?:youtu\.be\/|(?:www\.|m\.)?youtube\.com\/(?:watch|v|embed)(?:\.php)?(?:\?.*v=|\/))([a-zA-Z0-9\_-]+)
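In JavaScript the first capture group is index 1 of the match result. A minimal sketch, assuming a helper I have called extractVideoId (not part of the answer):

```javascript
// Regex from this answer. It is not anchored, so it finds a URL anywhere in the string.
const YT_ID = /(?:https?:\/\/)?(?:youtu\.be\/|(?:www\.|m\.)?youtube\.com\/(?:watch|v|embed)(?:\.php)?(?:\?.*v=|\/))([a-zA-Z0-9\_-]+)/;

// Hypothetical helper: returns the captured ID (group 1), or null if nothing matched.
function extractVideoId(text) {
  const m = text.match(YT_ID);
  return m ? m[1] : null;
}

console.log(extractVideoId("https://www.youtube.com/watch?v=DFYRQ_zQ-gk&t=10"));
console.log(extractVideoId("http://youtu.be/cCnrX1w5luM?t=30"));
console.log(extractVideoId("https://example.com/"));
```

Because the pattern has no ^ or $, it is better suited to pulling an ID out of text than to strict validation of a whole input.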
Upvotes: 20
Reputation: 1052
I tried this one and it works fine for me.
(?:http(?:s)?:\/\/)?(?:www\.)?(?:youtu\.be\/|youtube\.com\/(?:(?:watch)?\?(?:.*&)?v(?:i)?=|(?:embed|v|vi|user)\/))([^\?&\"'<> #]+)
You can check here https://regex101.com/r/Kvk0nB/1
Upvotes: 1
Reputation: 707
Format for YouTube videos has changed. This regex works for all cases:
^(http(s)??\:\/\/)?(www\.)?((youtube\.com\/watch\?v=)|(youtu\.be\/))([a-zA-Z0-9\-_])+
Tests here.
Upvotes: 12
Reputation: 3646
Based on the many other regexes here, this is the best I have come up with:
((http(s)?:\/\/)?)(www\.)?((youtube\.com\/)|(youtu\.be\/))[\S]+
Test: http://regexr.com/3bga2
Upvotes: 5
Reputation: 82460
Try this:
((http://)?)(www\.)?((youtube\.com/)|(youtu\.be)|(youtube)).+
Upvotes: 3
Reputation: 126
Check this pattern instead:
r'(?i)(http.//|https.//)*[A-Za-z0-9._%+-]+\.\w+'
Upvotes: -5