Neo

Reputation: 395

Check if regex pattern is correct

I just wrote a regex pattern for replacing links with HTML anchor tags. Here it is:

~((http\:\/\/|https\:\/\/)([^ ]+)) ~

I ask because I only finished this regex recently and ran a few tests with some links. It works well, but I want to be sure there are no bugs in the pattern (I'm a regex newbie), so maybe a regex expert could give an opinion and/or suggestions.

By the way, if you're wondering about the space at the end: you might think the pattern will not work if the string doesn't end with a space, but my trick is to append a space to the string before the replacement and remove it again once the replacement is done.
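To make the trick concrete, here is a minimal sketch of it. The helper name `linkify` and the example.com URL are only placeholders for illustration, not part of my real code:

```php
<?php

// Hypothetical helper that wraps the pad-then-trim trick described above.
function linkify(string $text): string
{
    $pattern = '~((http\:\/\/|https\:\/\/)([^ ]+)) ~';
    $replacement = '<a href="$1">$1</a> ';

    // Pad with one space so a URL at the very end of the string still matches.
    $padded = $text . ' ';
    $result = preg_replace($pattern, $replacement, $padded);

    // Drop the single space that was added for matching.
    return substr($result, 0, -1);
}

echo linkify('see http://example.com'), "\n";
// prints: see <a href="http://example.com">http://example.com</a>
```

The padding space is always the last character of the result, so cutting exactly one character keeps any trailing space that the input already had.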

P.S.:

I don't handle validation of the link itself. I just want to find the strings that start with http:// (or https://) and end with a space, nothing else, since link validation is a bit complicated.

EDIT:

Some of my code:

<?php

    $patron = "~(https?:\/\/[^\s]+) ~";
    //$patron = "~((http\:\/\/|https\:\/\/)([^ ]+)) ~";
    $reemplazar = '<a href="$1">$1</a> ';
    $cadena = "https://www.youtube.com/watch?v=7it5wioGixA ";

    echo preg_replace($patron, $reemplazar, $cadena);

?>

Upvotes: 0

Views: 122

Answers (2)

DaSourcerer

Reputation: 6606

I think this can be greatly simplified:

~(https?://\S+) ~

Other than that: Looks okay to me.
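One behavioural difference is worth knowing: `\S` stops at any whitespace, while `[^ ]` only stops at a literal space. A small sketch comparing the two patterns, with made-up example.com inputs:

```php
<?php

$original   = '~((http\:\/\/|https\:\/\/)([^ ]+)) ~';
$simplified = '~(https?://\S+) ~';

// Same capture when the URL is followed by a plain space.
$plain = 'http://example.com/a ';
preg_match($original, $plain, $m1);
preg_match($simplified, $plain, $m2);
var_dump($m1[1] === $m2[1]); // bool(true)

// With a line break after the URL, [^ ] runs across the newline,
// while \S stops there and then finds no space to finish the match.
$multiline = "http://example.com/a\nnext word";
preg_match($original, $multiline, $m3);
var_dump($m3[1]); // "http://example.com/a\nnext" (newline included)
$found = preg_match($simplified, $multiline, $m4);
var_dump($found); // int(0)
```

So for single-line input the two are interchangeable, but a URL that ends a line is swallowed together with the next word by the original and skipped entirely by the simplified version.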

Upvotes: 2

Casimir et Hippolyte

Reputation: 89574

Following the same idea, your pattern can be shortened to:

~https?://[^\s"'>]+~    # don't forget to escape the quote you use.
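As a rough illustration of how this shortened pattern behaves without the trailing space, here is a sketch using `preg_match_all` on an invented sample string. The pattern sits in a single-quoted PHP string, so the `'` inside the character class is escaped:

```php
<?php

// Single-quoted PHP string, so the ' inside the character class is escaped.
$pattern = '~https?://[^\s"\'>]+~';

$text = 'Plain https://example.com/a and quoted "http://example.org/b" at the end';
preg_match_all($pattern, $text, $matches);

print_r($matches[0]);
// $matches[0] holds https://example.com/a and http://example.org/b
```

Because the character class excludes whitespace, quotes and `>`, the match ends by itself at the closing quote or at the end of the string, so no padding space is needed.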

To change URLs to links:

$html = preg_replace_callback('~\b(?:(https?://)|www\.)[^]\s"\')<]++~',
    function ($m) {
        // Group 1 is only set when the match starts with http:// or https://;
        // a bare www. match gets a default scheme instead.
        $url = empty($m[1]) ? 'http://' . $m[0] : $m[0];
        if (filter_var($url, FILTER_VALIDATE_URL))
            return '<a href="' . $url . '">' . $m[0] . '</a>';
        else return $m[0];
    }, $html);

Old answer:

To change URLs inside links:

A better way to extract all href attributes from all "a" tags is to use the DOM.

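$doc = new DOMDocument();
@$doc->loadHTML($htmlString);
// href is an attribute, so select the "a" elements themselves
$links = $doc->getElementsByTagName('a');
foreach ($links as $link) {
    $href = $link->getAttribute('href');
    $link->setAttribute('href', 'what you want');
}
// serialize the modified document back to a string
$htmlString = $doc->saveHTML();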

Upvotes: 1
