Chad

Reputation: 714

PHP Regex not matching string as expected

The Regex:

https?://([a-zA-Z0-9-_]{1,50}[.])*[a-zA-z0-9-_]{1,50}[.]([(org)(gov)(com)]{3}|[(us)(fi)]{2})

The Tester:

http://regex.powertoy.org/

The Code:

if(preg_match_all('|https?://([a-zA-Z0-9-_]{1,50}[.])*[a-zA-z0-9-_]{1,50}[.]([(org)(gov)(com)]{3}|[(us)(fi)]{2})|',$row['text'],$links))
    {
        print_r($links[0]);
        /*for($x=0;$x<count($links[0]);$x++)
        {
            $row['text'] = str_replace($links[0][$x], 'link' . $links[0][$x] . 'link', $row['text']);
        }*/
    }else{
        echo 'Failure!';
    }

The regex matches URLs in the tester fine, but not at all in an HTML/PHP front end. I'm not sure what the problem is. The point of the regex/code is basically to match URLs regardless of the number of subdomains.

Upvotes: 0

Views: 84

Answers (2)

jeroen

Reputation: 91734

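You are using the | character as your delimiter, but you are also using it inside the regex. PHP treats the first unescaped | after the opening delimiter as the end of the pattern and then tries to read the rest ([(us)(fi)]{2})|) as modifiers, so preg_match_all() raises an "Unknown modifier" warning and returns false.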

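I would recommend using a different delimiter and making the regex case-insensitive, so that a-z alone is enough in the character classes. That avoids typos like a-zA-z, where the range A-z also matches the ASCII punctuation between Z and a ([, \, ], ^, _ and `):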

preg_match_all('#https?://([a-zA-Z0-9-_]{1,50}[.])*[a-zA-z0-9-_]{1,50}[.]([(org)(gov)(com)]{3}|[(us)(fi)]{2})#i',$row['text'],$links)
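To see the delimiter collision in isolation, here is a minimal sketch. The sample text and the simplified pattern body are assumptions made for illustration, not the asker's data; the only thing that changes between the two calls is the delimiter.

```php
<?php
// Hypothetical sample text; the question reads it from $row['text'].
$text = 'Visit http://www.example.com or https://example.org today.';

// Simplified pattern body (an assumption) that contains a literal "|".
$body = 'https?://(?:[a-z0-9_-]{1,50}[.])+(?:org|gov|com|us|fi)';

// "|" as delimiter: the pattern ends at the first unescaped "|" and the
// remainder is parsed as modifiers, which fails. "@" only silences the
// warning so the return value can be inspected.
$broken = @preg_match_all('|' . $body . '|', $text, $m1);

// "#" as delimiter: the "|" inside the pattern is plain alternation again.
$fixed = preg_match_all('#' . $body . '#i', $text, $m2);

var_dump($broken); // bool(false)
var_dump($fixed);  // int(2)
print_r($m2[0]);   // the two URLs found in $text
```

Checking the return value with === false separates "the pattern failed to compile" from "no matches", which both end up in the else branch of the original code.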

Upvotes: 2

Ωmega

Reputation: 43673

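Inside [...], parentheses and letters are just single characters, so [(org)(gov)(com)]{3} matches any three of those characters rather than the words org, gov or com. Use alternation instead. A fixed version of your pattern: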

https?:\/\/(?:[\w-]{1,50}\.)*[\w-]{1,50}\.(?:org|gov|com|us|fi)

But I recommend using:

https?:\/\/(?:[a-zA-Z\d]+(?:\-[a-zA-Z\d]+)*\.)+(?:org|gov|com|us|fi) 
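As a rough sketch of how the recommended pattern could be used from PHP, the snippet below wraps each matched URL in the 'link' markers from the question's commented-out loop. The sample text, the # delimiter and the use of preg_replace() in place of the str_replace() loop are assumptions for illustration.

```php
<?php
// Hypothetical sample text; the question reads it from $row['text'].
$text = 'Docs at http://my-site.example.com and https://example.fi are online.';

// The recommended pattern, delimited with "#" so the "|" alternation
// inside it cannot be mistaken for the closing delimiter.
$pattern = '#https?:\/\/(?:[a-zA-Z\d]+(?:\-[a-zA-Z\d]+)*\.)+(?:org|gov|com|us|fi)#';

// Collect the matches, as in the question.
$count = preg_match_all($pattern, $text, $links);

// Wrap every match in one pass; ${0} refers to the whole match.
$result = preg_replace($pattern, 'link${0}link', $text);

print_r($links[0]);
echo $result, "\n";
```

Doing the replacement with preg_replace() avoids the manual loop, and it also avoids replacing the same URL twice when it appears more than once in the text.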

Upvotes: 2
