Reputation: 1112
I have an input area where people post updates. I want to find YouTube links in that text, modify them, and append them at the end.
The content is not HTML; it does not even contain <br> or <p> tags, it is just a plain string.
Here is the code I've put together from a different part of the program.
What it should do is take all matches and replace them with HTML.
function aKaFilter( $content ) {
global $bp;
$pattern2 = '#^(?:https?://)?(?:www\.)?(?:youtube(?:-nocookie)?\.com/(?:[^/]+/.+/|(?:v|e(?:mbed)?)/|.*[?&]v=)|youtu\.be/)([^"&?/ ]{11})(?:.+)?$#x';
preg_match_all( $pattern2, $content, $youtubes );
if ( $youtubes ) {
/* Make sure there's only one instance of each video */
if ( !$youtubes = array_unique( $youtubes[1] ) )
return $content;
//but we need to watch for edits and if something was already wrapped in html link - thus check for space or word boundary prior
foreach( (array)$youtubes as $youtube ) {
$pattern = "NEW". $youtube ."PATTERN TO MATCH THIS LINK";
$content = preg_replace( $pattern, '<span class="video youtube" data-trigger="'.$youtube.'"><img src="http://img.youtube.com/vi/'.$youtube.'/0.jpg"><span class="icon-stack"><i class="icon-circle icon-stack-base"></i><i class="icon-youtube-play"></i></span><span>title</span></span>', $content );
}
}
return $content;
}
Here is the original code:
function etivite_bp_activity_hashtags_filter( $content ) {
global $bp;
//what are we doing here? - same at atme mentions
//$pattern = '/[#]([_0-9a-zA-Z-]+)/';
$pattern = '/(?(?<!color: )(?<!color: )[#]([_0-9a-zA-Z-]+)|(^|\s|\b)[#]([_0-9a-zA-Z-]+))/';
preg_match_all( $pattern, $content, $hashtags );
if ( $hashtags ) {
/* Make sure there's only one instance of each tag */
if ( !$hashtags = array_unique( $hashtags[1] ) )
return $content;
//but we need to watch for edits and if something was already wrapped in html link - thus check for space or word boundary prior
foreach( (array)$hashtags as $hashtag ) {
$pattern = "/(^|\s|\b)#". $hashtag ."($|\b)/";
$content = preg_replace( $pattern, ' <a href="' . $bp->root_domain . "/" . $bp->activity->slug . "/". BP_ACTIVITY_HASHTAGS_SLUG ."/" . htmlspecialchars( $hashtag ) . '" rel="nofollow" class="hashtag">#'. htmlspecialchars( $hashtag ) .'</a>', $content );
}
}
return $content;
}
What it does is take the textarea content and replace each #hash with <a>#hash</a>, like the hashtags you see on social media.
What I want my function to do is take YouTube links and convert them to <a>ID</a> (basically).
It works fine if the content is only a YouTube link, but when there is text before or after the link, it just goes crazy.
I guess it does not work because I didn't come up with the second $pattern, which was there in the other program.
Upvotes: 0
Views: 1047
Reputation: 134
Try using one of these URLs:
Result in JSON format: http://gdata.youtube.com/feeds/mobile/videos?alt=json&q=music&format=1,5,6
Result in XML format: http://gdata.youtube.com/feeds/mobile/videos?q=music&format=1,5,6
Then, for the XML format, run a regular expression on the tag:youtube.com,2008:video:qycqF1CWcXg entry to retrieve the video ID, i.e. "qycqF1CWcXg" in this example.
The same steps apply to the JSON format.
Upvotes: 0
Reputation: 2551
The problem with matching URLs inside a text using regexes is that you can't know where the URL ends.
URLs can contain spaces, periods (.), commas (,) and other characters, so you can't say that the URL ends when a new word begins or when a sentence ends. Besides, the end of your regex, (?:.+)? , will match (almost) everything.
If you assume that a YouTube URL cannot contain whitespace (after a given position in the URL), you can change the end of your regex to (?:[^\s]+)? (anything but whitespace). You can add other characters to the set to define where your URL ends; for example, if the URL must not contain a comma either, use (?:[^\s,]+)? , and so on.
Next, your regex has beginning and ending anchors (^ and $). These will not work when your URL is surrounded by other text, so you can remove them and add the \b (word boundary) anchor at the beginning of your regex instead.
By the way, you can replace (?:.+)? with .* and (?:[^\s,]+)? with [^\s,]*
You now have a regex like this: '#\b(?:https?://)?(?:www\.)?(?:youtube(?:-nocookie)?\.com/(?:[^/]+/.+/|(?:v|e(?:mbed)?)/|.*[?&]v=)|youtu\.be/)([^"&?/ ]{11})[^\s,]*#x'
NB: I did not analyze all the logic of your regex, so my comments only apply to its beginning and end.
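As a minimal sketch of how the unanchored pattern behaves inside surrounding text, the snippet below passes it to preg_replace_callback so that each link is matched and replaced in one pass. The function name replaceYoutubeLinks and the reduced <span> markup are placeholders chosen for illustration, not part of the original code; only the pattern is taken from the answer above.

```php
<?php
// Hypothetical helper: the name and the reduced markup are illustrative only.
function replaceYoutubeLinks( $content ) {
    // Pattern from the answer above: \b replaces ^, and [^\s,]* replaces (?:.+)?$
    $pattern = '#\b(?:https?://)?(?:www\.)?(?:youtube(?:-nocookie)?\.com/(?:[^/]+/.+/|(?:v|e(?:mbed)?)/|.*[?&]v=)|youtu\.be/)([^"&?/ ]{11})[^\s,]*#x';

    return preg_replace_callback( $pattern, function ( $m ) {
        // $m[0] is the whole matched URL, $m[1] is the 11-character video ID
        return '<span class="video youtube" data-trigger="' . htmlspecialchars( $m[1] ) . '"></span>';
    }, $content );
}

echo replaceYoutubeLinks( 'Watch this https://www.youtube.com/watch?v=dQw4w9WgXcQ, it is great' );
```

Because the callback receives both the full match and the captured ID, no second pattern is needed for the replacement step. Note that the greedy .* before [?&]v= can still run across spaces, so two links on the same line may need a tighter pattern.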
Upvotes: 1
Reputation: 1976
Why do you need preg_replace()? str_replace() in your case should suffice. Also you probably need to iterate over $youtubes[0], not $youtubes. Plus simplify your code! ;-)
Ergo this should work:
function aKaFilter( $content ) {
    global $bp;
    // Pattern kept from the question (still anchored with ^ and $)
    $pattern2 = '#^(?:https?://)?(?:www\.)?(?:youtube(?:-nocookie)?\.com/(?:[^/]+/.+/|(?:v|e(?:mbed)?)/|.*[?&]v=)|youtu\.be/)([^"&?/ ]{11})(?:.+)?$#x';
    // $youtubes[0] holds the full matches, $youtubes[1] the video IDs
    if ( !preg_match_all( $pattern2, $content, $youtubes ) )
        return $content;
    /* Make sure there's only one instance of each video */
    //but we need to watch for edits and if something was already wrapped in html link - thus check for space or word boundary prior
    // array_unique() preserves keys, so $i still points at the matching ID
    foreach( array_unique( $youtubes[0] ) as $i => $url ) {
        $id = $youtubes[1][$i];
        $content = str_replace( $url, '<span class="video youtube" data-trigger="'.$id.'"><img src="http://img.youtube.com/vi/'.$id.'/0.jpg"><span class="icon-stack"><i class="icon-circle icon-stack-base"></i><i class="icon-youtube-play"></i></span><span>title</span></span>', $content );
    }
    return $content;
}
Upvotes: 1
Reputation: 10000
Don't use a regex for this at all; use parse_url.
For instance:
$parsed_url = parse_url($content);
if (in_array($parsed_url['host'], array('www.youtube.com', 'youtube.com', 'www.youtube-nocookie.com', 'youtube-nocookie.com'))) {
## Now look through $parsed_url['query'] for the video ID
## Parsing this out is a separate question :)
}
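The query-string step left open above could be sketched with parse_str, which splits a query string into an array. This assumes the input is a single URL that includes its scheme (parse_url does not report a host otherwise) and that the ID sits in the v parameter; the function name youtubeIdFromUrl is hypothetical.

```php
<?php
// Hypothetical helper: returns the video ID from a single YouTube watch URL, or null.
function youtubeIdFromUrl( $url ) {
    $parts = parse_url( $url );
    $hosts = array( 'www.youtube.com', 'youtube.com', 'www.youtube-nocookie.com', 'youtube-nocookie.com' );

    // parse_url() returns false on badly malformed input and omits missing components
    if ( !is_array( $parts ) || !isset( $parts['host'], $parts['query'] ) || !in_array( $parts['host'], $hosts, true ) ) {
        return null;
    }

    // parse_str() fills $query with the decoded key/value pairs of the query string
    parse_str( $parts['query'], $query );

    return ( isset( $query['v'] ) && is_string( $query['v'] ) ) ? $query['v'] : null;
}

var_dump( youtubeIdFromUrl( 'https://www.youtube.com/watch?v=dQw4w9WgXcQ&t=10' ) );
```

This only handles one URL at a time, so pulling links out of free text would still need a separate step before calling it.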
Upvotes: 1