Reputation: 7094
I'm working on a simple script to extract the channel ID from a YouTube URL.
For example, to get the channel ID from this URL:
$url = 'https://youtube.com/channel/UCBLAoqCQyz6a0OvwXWzKZag';
I use regex:
preg_match( '/\/channel\/(([^\/])+?)$/', $url, $matches );
This works fine, but if the URL has extra parameters or anything else after the channel ID, the match fails. For example:
https://youtube.com/channel/UCBLAoqCQyz6a0OvwXWzKZag?PARAMETER=HELLO
https://youtube.com/channel/UCBLAoqCQyz6a0OvwXWzKZag/RANDOMFOLDER
etc.
My question is: how can I adjust my regex so it works with those URLs? The match should contain only the channel ID, not the extra parameters or path segments.
Feel free to test my ideone code.
Upvotes: 2
Views: 398
Reputation: 626699
You can fix the regexps in the following way:
$preg_entities = [
    'channel_id' => '\/channel\/([^\/?#]+)', // match YouTube channel ID from URL
    'user'       => '\/user\/([^\/?#]+)',    // match YouTube user from URL
];
See the PHP demo.
With the [^\/?#]+ patterns, the match stops at the first /, ? or # after the ID, so the regex never runs into a trailing path segment, the query string, or the fragment of a URL, and you get clean values in the output.
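To see the character class in isolation, here is a minimal sketch that runs the channel pattern over the three URLs from the question. The $urls list, the $ids array, and the loop are illustrative assumptions and not part of the original code:

```php
<?php
// URLs taken from the question; each has different trailing content.
$urls = [
    'https://youtube.com/channel/UCBLAoqCQyz6a0OvwXWzKZag',
    'https://youtube.com/channel/UCBLAoqCQyz6a0OvwXWzKZag?PARAMETER=HELLO',
    'https://youtube.com/channel/UCBLAoqCQyz6a0OvwXWzKZag/RANDOMFOLDER',
];

$ids = []; // illustrative container for the captured IDs
foreach ($urls as $url) {
    // [^\/?#]+ consumes characters up to the first "/", "?" or "#".
    if (preg_match('/\/channel\/([^\/?#]+)/', $url, $matches)) {
        $ids[] = $matches[1];
    }
}

print_r($ids);
```

All three URLs should yield the same captured ID, because the capture group ends before any trailing path segment or query string.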
Full code snippet:
function getYouTubeXMLUrl( $url ) {
    $xml_youtube_url_base = 'https://youtube.com/feeds/videos.xml';
    $preg_entities = [
        'channel_id' => '\/channel\/([^\/?#]+)', // match YouTube channel ID from URL
        'user'       => '\/user\/([^\/?#]+)',    // match YouTube user from URL
    ];
    foreach ( $preg_entities as $key => $preg_entity ) {
        // Wrap each pattern in / delimiters and try it against the URL.
        if ( preg_match( '/' . $preg_entity . '/', $url, $matches ) ) {
            if ( isset( $matches[1] ) ) {
                return [
                    'rss'  => $xml_youtube_url_base . '?' . $key . '=' . $matches[1],
                    'id'   => $matches[1],
                    'type' => $key,
                ];
            }
        }
    }
    return null; // no channel or user segment found in the URL
}
Test:
$url = 'https://youtube.com/channel/UCBLAoqCQyz6a0OvwXWzKZag?PARAMETER=HELLO';
print_r(getYouTubeXMLUrl($url));
// => Array( [rss] => https://youtube.com/feeds/videos.xml?channel_id=UCBLAoqCQyz6a0OvwXWzKZag [id] => UCBLAoqCQyz6a0OvwXWzKZag [type] => channel_id )
$url = 'https://youtube.com/channel/UCBLAoqCQyz6a0OvwXWzKZag/RANDOMFOLDER';
print_r(getYouTubeXMLUrl($url));
// => Array( [rss] => https://youtube.com/feeds/videos.xml?channel_id=UCBLAoqCQyz6a0OvwXWzKZag [id] => UCBLAoqCQyz6a0OvwXWzKZag [type] => channel_id )
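As an alternative sketch (not the approach used in the answer above), the path can be isolated with PHP's parse_url() before matching, so the query string and fragment are already gone when the regex runs. The function name getYouTubeChannelId is hypothetical, and the sketch assumes a reasonably well-formed absolute URL:

```php
<?php
// Hypothetical helper: returns the channel ID, or null when none is found.
function getYouTubeChannelId(string $url): ?string
{
    // With PHP_URL_PATH, parse_url() returns only the path component,
    // so "?..." and "#..." never reach the regex.
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        return null; // malformed URL or no path component
    }
    // "#" is used as the regex delimiter so slashes need no escaping.
    if (preg_match('#/channel/([^/]+)#', $path, $matches)) {
        return $matches[1];
    }
    return null;
}

var_dump(getYouTubeChannelId('https://youtube.com/channel/UCBLAoqCQyz6a0OvwXWzKZag?PARAMETER=HELLO'));
var_dump(getYouTubeChannelId('https://youtube.com/channel/UCBLAoqCQyz6a0OvwXWzKZag/RANDOMFOLDER'));
var_dump(getYouTubeChannelId('https://youtube.com/watch?v=abc'));
```

The trade-off is one extra function call in exchange for a simpler character class, since only / has to be excluded once the path is separated from the rest of the URL.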
Upvotes: 1