Reputation: 5969

How to match bare urls with regex in PHP?

$bits = preg_split('#((?:https?|ftp)://[^\s\'"<>()]+)#S', $token->data, -1, PREG_SPLIT_DELIM_CAPTURE);

Say I'm trying to match URLs that need to be linkified. The above is too permissive.

I want to match only bare URLs like http://google.com, but not URLs that already sit inside markup, such as <a href="http://google.com">http://google.com</a> or <iframe src="http://google.com"></iframe>.

Upvotes: 1

Answers (4)

Amit Kumar Gupta

Reputation: 7413

A more effective regex:

[hf]t{1,2}p:\/\/[a-zA-Z0-9\.\-]*

Result

Array
(
    [0] => Array
        (
            [0] => ftp://article-stack.com
            [1] => http://google.com
        )
)
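
For reference, here is a minimal sketch of how a result shaped like the one above could be produced with preg_match_all. The input string is an assumption (the answer does not show it). Note that the character class [hf]t{1,2}p would also accept strings like htp:// or fttp://, and that the pattern does not distinguish bare URLs from URLs inside attributes.

```php
<?php
// Hypothetical input; the original answer does not show the string it was run against.
$text = 'Mirror: ftp://article-stack.com and search: http://google.com';

// The pattern from the answer, wrapped in / delimiters.
$pattern = '/[hf]t{1,2}p:\/\/[a-zA-Z0-9\.\-]*/';

// With the default flags, $matches[0] holds every full match in order.
preg_match_all($pattern, $text, $matches);
print_r($matches);
```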

Upvotes: 0

Amit Kumar Gupta

Reputation: 7413

http:\/\/[a-zA-Z0-9\.\-]*

Result

Array
(
    [0] => http://google.com
)

Upvotes: 0

jatt

Reputation: 398

Try this:

function validUrl($url){
        $matches = array();
        $regex  = '#(^';                    # Whole URL: $matches[1]
        $regex .= '((https?|ftps?)://)?';   # Optional scheme: $matches[2] (with ://), $matches[3] (name only)
        $regex .= '(([0-9a-z-]+\.)+';       # Complete domain: $matches[4]; last label before the TLD: $matches[5]
        $regex .= '([a-z]{2,3}|aero|coop|jobs|mobi|museum|name|travel))'; # TLD: $matches[6]
        $regex .= '(:[0-9]{1,5})?';         # Optional port: $matches[7]
        $regex .= '(\/[^ ]*)?';             # Optional path and query: $matches[8]
        $regex .= '$)#i';
        if (!preg_match($regex, $url, $matches)) {
            return FALSE;
        }
        # gethostbyname() returns the unmodified hostname when the lookup fails,
        # so compare against the input instead of testing for a falsy value.
        if (gethostbyname($matches[4]) === $matches[4]) {
            return FALSE;
        }
        return $matches;
    }

Upvotes: 0

Nick Bastin

Reputation: 31339

It appears that you're trying to parse HTML using regular expressions. You might want to rethink that.
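
One way to act on this is to let a parser separate text from markup and apply the regex only to text nodes. The following is a rough sketch, not a definitive implementation: it uses PHP's DOM extension with the pattern from the question, the function name linkifyBareUrls and the wrapper id linkify-root are made up for illustration, and character-encoding handling is left out (plain ASCII input is assumed).

```php
<?php
// Hypothetical helper: linkify bare URLs while leaving existing tags and attributes alone.
function linkifyBareUrls(string $html): string
{
    // Pattern taken from the question; the single capture group is the URL.
    $pattern = '#((?:https?|ftp)://[^\s\'"<>()]+)#S';

    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // silence warnings about loose HTML
    // Wrap the fragment in a known container so it can be located again after parsing.
    $doc->loadHTML('<div id="linkify-root">' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $root = $xpath->query('//div[@id="linkify-root"]')->item(0);

    // Only text nodes that are not already inside a link; attribute values
    // such as href or src are never text nodes, so they are skipped for free.
    $textNodes = iterator_to_array($xpath->query('.//text()[not(ancestor::a)]', $root));

    foreach ($textNodes as $textNode) {
        $parts = preg_split($pattern, $textNode->nodeValue, -1, PREG_SPLIT_DELIM_CAPTURE);
        if (count($parts) < 2) {
            continue; // no URL in this text node
        }
        $fragment = $doc->createDocumentFragment();
        foreach ($parts as $i => $part) {
            if ($part === '') {
                continue;
            }
            if ($i % 2 === 1) {
                // Odd indexes hold the captured URLs.
                $link = $doc->createElement('a');
                $link->setAttribute('href', $part);
                $link->appendChild($doc->createTextNode($part));
                $fragment->appendChild($link);
            } else {
                $fragment->appendChild($doc->createTextNode($part));
            }
        }
        $textNode->parentNode->replaceChild($fragment, $textNode);
    }

    // Serialize only the children of the wrapper, not the wrapper itself.
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}
```

Calling linkifyBareUrls() on a string that mixes a bare http://google.com with an existing <a href="http://google.com"> link should wrap only the bare occurrence, and an <iframe src="http://google.com"></iframe> should pass through without gaining a link.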

Upvotes: 2
