Reputation:
I have a text that contains a YouTube URL. I need to remove all portions of the link, except for the YouTube video code. The URL may be surrounded by blank space or nothing; no non-blank characters will adjoin the URL.
SAMPLE:
$txt = "This text contain this link: https://www.youtube.com/watch?v=b8ri14rw32c&rel=0 and so on...";
EXTRACTING ID:
$pattern = '#(?<=v=|v\/|vi=|vi\/|youtu\.be\/)[a-zA-Z0-9_-]{11}#';
preg_match_all($pattern, $txt, $matches);
print_r($matches);
EXPECTED:
Array
(
[0] = "This text contain this link b8ri14rw32c and so on..."
)
Upvotes: 1
Views: 846
Reputation: 8042
You can try this pattern to match:
https:\/\/(?:www\.)?youtu(?:be\.com|\.be)\/(?:watch\?vi?[=\/])?([\w-]{11})(?:&\w+=[^&\s]*)*
There is exactly one capture group in this expression, and it holds the YouTube video code. Used with a regex replace, it lets you replace the entire link text with just the captured video code.
This regex works with the following YouTube URL formats:
https://www.youtube.com/watch?v=b8ri14rw32c&rel=0
https://youtu.be/Rk_sAHh9s08
Other YouTube URL formats have not been tested, but could easily be supported if needed.
This PHP code tests the replacement using preg_replace:
$txt = "This text contain this link: https://www.youtube.com/watch?v=b8ri14rw32c&rel=0 and so on...";
$pattern = "/https:\/\/(?:www\.)?youtu(?:be\.com|\.be)\/(?:watch\?vi?[=\/])?([\w-]{11})(?:&\w+=[^&\s]*)*/";
$text = preg_replace($pattern, '$1', $txt);
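To check the replacement against both URL formats listed in this answer, something like the following could be used. It is only a sketch: the helper name replaceYoutubeLinks is hypothetical, the `#` delimiter is chosen to avoid escaping slashes, and the character class [\w-] is an assumption made so that IDs containing a hyphen are also matched.

```php
<?php
// Hypothetical helper: replaces each YouTube link in $text with its 11-character video ID.
function replaceYoutubeLinks(string $text): string
{
    // Same structure as the pattern in this answer; [\w-] (an assumption) also allows hyphens.
    $pattern = '#https://(?:www\.)?youtu(?:be\.com|\.be)/(?:watch\?vi?[=/])?([\w-]{11})(?:&\w+=[^&\s]*)*#';

    // preg_replace returns null on a regex error; fall back to the original text in that case.
    return preg_replace($pattern, '$1', $text) ?? $text;
}

$samples = [
    'This text contain this link: https://www.youtube.com/watch?v=b8ri14rw32c&rel=0 and so on...',
    'Short form: https://youtu.be/Rk_sAHh9s08 works too',
];

foreach ($samples as $sample) {
    echo replaceYoutubeLinks($sample), PHP_EOL;
}
```

Text without a matching link passes through unchanged, so the helper can be applied to arbitrary input.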
Upvotes: 2
Reputation: 1640
If I understood you correctly, the following should work for normal YouTube links (unshortened).
https?:\/\/[^\s]+[?&]v=([^&\s]+)[^\s]*
Replace with \1 (capturing group 1).
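In PHP, this might look like the sketch below. The variable names and the `#` delimiter are assumptions, and '$1' is used as the replacement reference, which preg_replace treats the same as \1. Note that this pattern matches any URL carrying a v= query parameter, not only YouTube ones.

```php
<?php
$txt = 'This text contain this link: https://www.youtube.com/watch?v=b8ri14rw32c&rel=0 and so on...';

// The pattern from this answer, with # as delimiter so the slashes need no escaping.
$pattern = '#https?://[^\s]+[?&]v=([^&\s]+)[^\s]*#';

// Replace the whole URL with the value of its v= parameter (capturing group 1).
$result = preg_replace($pattern, '$1', $txt);

echo $result, PHP_EOL;
```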
Upvotes: 2