Kimomaru

Reputation: 1023

Excluding a string in a preg_match

I have two preg_replace statements that work fine by themselves, but when I use both of them the second one breaks the first one:

$message = preg_replace("!(http(s)?://(www\.|m\.)?(youtu\.be/|youtube\.com/watch\?v=)([-|~_0-9A-Za-z]+))!", 
"<p><a href= '/viewpost.php?messageid=$message_id'><img src='http://img.youtube.com/vi/$5/0.jpg' width='536' border='1'></a><p>", $message);

$message = preg_replace("!(((f|ht)tp(s)?://)[-a-zA-Z?-??-?()0-9@:%_+.~#?&;//=]+)!i", "<a href='$1' target='_blank' STYLE='TEXT-DECORATION: NONE'><b>$1</b></a>", 
$message);

I posted this recently and was advised to put the patterns and replacements into arrays and apply them together in a single preg_replace call, but that does not address the issue (although the code looked nicer).

My idea is to make the second statement exclude the URLs matched by the first statement (in this case, "youtube.com/" and "youtu.be/"), but I can't get it to work:

$message = preg_replace("!(http(s)?://(www\.|m\.)?(youtu\.be/|youtube\.com/watch\?v=)([-|~_0-9A-Za-z]+))!", 
"<p><a href= '/viewpost.php?messageid=$message_id'><img src='http://img.youtube.com/vi/$5/0.jpg' width='536' border='1'></a><p>", $message);

$message = preg_replace("!(((f|ht)tp(s)?://)/^(?\!\youtube\.|youtu\.be/)[-a-zA-Z?-??-?()0-9@:%_+.~#?&;//=]+)!i", "<a href='$1' target='_blank' STYLE='TEXT-DECORATION: NONE'><b>$1</b></a>", 
$message);

I think I'm close to finding a way, but I'm missing something.

Upvotes: 1

Views: 501

Answers (1)

Peter van der Wal

Reputation: 11856

You're almost there. In your second preg_replace you have a few errors.

  1. Remove the /^ after the http part. The URL has no extra slash at that point, and ^ only matches at the start of the string.
  2. The regex doesn't compile. The syntax for excluding something is a negative lookahead, (?!...), meaning "not followed by". Because ! is also your pattern delimiter, you escaped it as \!, and that breaks the lookahead. I've switched the delimiter to # instead.
  3. By the time the second statement runs, the first one has already replaced youtu.be/... with img.youtube.com/..., so that is the URL you should exclude.

Working code:

$message = 'http://www.youtube.com/watch?v=123 http://www.google.com';
$message_id = 6;

$message = preg_replace("!(http(s)?://(www\.|m\.)?(youtu\.be/|youtube\.com/watch\?v=)([-|~_0-9A-Za-z]+))!", 
"<p><a href= '/viewpost.php?messageid=$message_id'><img src='http://img.youtube.com/vi/$5/0.jpg' width='536' border='1'></a><p>", $message);

$message = preg_replace("#(((f|ht)tp(s)?://)(?!img\.youtube\.com/vi/)[-a-zA-Z?-??-?()0-9@:%_+.~\#?&;//=]+)#i", "<a href='$1' target='_blank' STYLE='TEXT-DECORATION: NONE'><b>$1</b></a>", 
$message);

echo htmlspecialchars($message);
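
To see what the negative lookahead does on its own, here is a minimal sketch. The two sample URLs and the shortened \S+ tail are assumptions made for illustration; they stand in for the longer character class used above.

```php
<?php
// Scheme, then a lookahead that rejects the thumbnail host, then the rest
// of the URL. \S+ is a simplification of the full character class.
$pattern = '#(?:f|ht)tps?://(?!img\.youtube\.com/vi/)\S+#i';

$thumbnail = 'http://img.youtube.com/vi/123/0.jpg'; // produced by the first replace
$other     = 'http://www.google.com';               // any other link

// preg_match returns 1 when the pattern matches and 0 when it does not.
$thumbnailMatches = preg_match($pattern, $thumbnail); // 0: the lookahead blocks it
$otherMatches     = preg_match($pattern, $other);     // 1: nothing to exclude

var_dump($thumbnailMatches, $otherMatches);
```

The lookahead consumes no characters. It only checks what follows the scheme, so the rest of the pattern is free to match the whole URL when the check passes.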

Besides that, a solution with preg_replace_callback might be easier to read and much easier to extend with future options (for example inline images if url ends with .jpg):

$message = 'http://www.youtube.com/watch?v=123 http://www.google.com';
$message_id=6;

$callback = function($matches) use ($message_id) {
    $youtube = preg_replace("!(http(s)?://(www\.|m\.)?(youtu\.be/|youtube\.com/watch\?v=)([-|~_0-9A-Za-z]+))!", 
"<p><a href= '/viewpost.php?messageid=$message_id'><img src='http://img.youtube.com/vi/$5/0.jpg' width='536' border='1'></a><p>", $matches[0], -1, $count);
    if ($count) {
        return $youtube;
    } else {
        return "<a href='".$matches[0]."' target='_blank' STYLE='TEXT-DECORATION: NONE'><b>".$matches[0]."</b></a>";
    }
};

$message = preg_replace_callback("!(((f|ht)tp(s)?://)[-a-zA-Z?-??-?()0-9@:%_+.~#?&;//=]+)!i", $callback, $message);

echo htmlspecialchars($message);

(Note that a closure, $callback = function() { ... }, requires PHP 5.3 or later; on older versions you can use a named function instead.)
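
As a rough sketch of how the callback approach could be extended with the inline-image idea mentioned above: the .jpg branch, the example.com URL, the variable names and the simplified markup are all assumptions for illustration, not part of the original code.

```php
<?php
$message = 'http://www.youtube.com/watch?v=123 http://example.com/cat.jpg http://www.google.com';
$message_id = 6;

// Same YouTube pattern as above, with only the video id captured.
$youtubePattern = '!https?://(?:www\.|m\.)?(?:youtu\.be/|youtube\.com/watch\?v=)([-|~_0-9A-Za-z]+)!';

$callback = function ($matches) use ($message_id, $youtubePattern) {
    $url = $matches[0];

    // Branch 1: YouTube links become a thumbnail that links to the post.
    if (preg_match($youtubePattern, $url, $video)) {
        return "<p><a href='/viewpost.php?messageid=$message_id'>"
             . "<img src='http://img.youtube.com/vi/{$video[1]}/0.jpg' width='536' border='1'></a></p>";
    }

    // Branch 2 (hypothetical extension): URLs ending in .jpg become inline images.
    if (preg_match('!\.jpg$!i', $url)) {
        return "<img src='$url'>";
    }

    // Default: a plain clickable link.
    return "<a href='$url' target='_blank'><b>$url</b></a>";
};

// One pass over every URL; the callback decides how each one is rendered.
$result = preg_replace_callback('!(?:f|ht)tps?://[-a-zA-Z()0-9@:%_+.~#?&;/=]+!i', $callback, $message);

echo htmlspecialchars($result);
```

Because each URL is visited exactly once, no branch can re-match the output of another, so no lookahead is needed here.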

Upvotes: 1
