Mike

Reputation: 51

Regular Expression for URL

The regular expression below matches URLs, including bare ones such as example.com. However, I want it to match only URLs that start with www. or with a scheme such as http, https, etc. In other words, it should match www.example.com but not example.com.

((((ht|f)tp(s?))\://)?((www.|[a-zA-Z])([a-zA-Z0-9\-]+\.)([a-zA-Z]{2,8}))(\:[0-9]+)*(/($|[a-zA-Z0-9\.\,\;\?\'\\\+&%\$#\=~_\-]+))*)

Upvotes: 2

Views: 1410

Answers (4)

Alix Axel

Reputation: 154543

Here you go:

\b((?:[a-z][\w-]+:(?:\/{1,3}|[a-z0-9%])|www\d{0,3}[.])(?:[^\s()<>]+|\([^\s()<>]+\))+(?:\([^\s()<>]+\)|[^`!()\[\]{};:'".,<>?«»“”‘’\s]))

It's the revised Liberal URL Regex from Daring Fireball.
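For anyone who wants to try it quickly, here is a minimal Python sketch that runs the pattern above over a line of text. The names `LIBERAL_URL` and `sample` are made up for illustration, and the case-insensitive flag is an assumption (the pattern only lists lowercase letters, so it needs one to match uppercase schemes).

```python
import re

# The pattern from this answer, unchanged. A triple-quoted raw string is
# used because the pattern contains both ' and " characters.
LIBERAL_URL = re.compile(
    r"""\b((?:[a-z][\w-]+:(?:\/{1,3}|[a-z0-9%])|www\d{0,3}[.])(?:[^\s()<>]+|\([^\s()<>]+\))+(?:\([^\s()<>]+\)|[^`!()\[\]{};:'".,<>?«»“”‘’\s]))""",
    re.IGNORECASE,  # assumption: match HTTP:// and WWW. as well
)

# Hypothetical sample text: one www. URL, one URL with a scheme, one bare domain.
sample = "Docs at www.example.com, mirror at http://example.org/a_(b). Not example.com though."

# With a single capturing group, findall returns the text of that group.
found = LIBERAL_URL.findall(sample)
print(found)
```

The bare example.com is skipped because nothing in the pattern can start a match without either a scheme followed by a colon or a leading www.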

Upvotes: 0

Hmmm try

(((((ht|f)tp(s?))\://)|(www\.))((|[a-zA-Z])([a-zA-Z0-9-]+.)([a-zA-Z]{2,8}))(\:[0-9]+)*(/($|[a-zA-Z0-9.\,\;\?\'\+&%\$#\=~_-]+))*)

EDIT: Yeah, I didn't really test that one. Ok, I didn't test this one either but I looked at it REALLY carefully ;)

(((((ht|f)tp(s?))\://)|(www\.))(([a-zA-Z0-9-]+\.)?([a-zA-Z0-9]+\.)([a-zA-Z]{2,8}))(\:[0-9]+)*(/($|[a-zA-Z0-9.\,\;\?\'\+&%\$#\=~_-]+))*)

You should look into a good regex tester. I usually use Expresso but there are many others out there.
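If you don't have a tester handy, a few lines of script do the same job. This is only a sketch: it uses the second pattern above with the dot after each domain label written as `\.`, and the names `URL_RE`, `ACCEPT`, `REJECT` and `run_cases` are made up for the example. It checks whole strings with `fullmatch`, which is an assumption about how you want to test; use `search` instead if you are scanning running text.

```python
import re

# Second pattern from this answer, with the label dots escaped (\.).
URL_RE = re.compile(
    r"(((((ht|f)tp(s?))\://)|(www\.))"
    r"(([a-zA-Z0-9-]+\.)?([a-zA-Z0-9]+\.)([a-zA-Z]{2,8}))"
    r"(\:[0-9]+)*"
    r"(/($|[a-zA-Z0-9.\,\;\?\'\+&%\$#\=~_-]+))*)"
)

# Hypothetical test data based on the question: URLs with a scheme or a
# leading www. should match, bare domains should not.
ACCEPT = [
    "www.example.com",
    "http://example.com",
    "https://www.example.com/path?x=1",
    "ftp://example.org:21/",
]
REJECT = ["example.com", "mail.example.com"]


def run_cases(pattern, accept, reject):
    """Return (candidate, expected, actual) for every case that fails."""
    failures = []
    cases = [(text, True) for text in accept] + [(text, False) for text in reject]
    for text, expected in cases:
        actual = pattern.fullmatch(text) is not None
        if actual != expected:
            failures.append((text, expected, actual))
    return failures


failures = run_cases(URL_RE, ACCEPT, REJECT)
print("all cases pass" if not failures else failures)
```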

Upvotes: 1

Wayne Conrad

Reputation: 107989

Validate that the URI is well-formed with a regexp; the one in RFC 3986 will do. Then validate that it is plausible with code. Trying to combine the well-formedness check and the plausibility check into one regexp is too difficult to get right. See: Need a regex to validating a Url...
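A rough Python sketch of that two-step idea follows. The regexp is the component-splitting one from RFC 3986, Appendix B; everything else is an assumption made for illustration: the function name `is_wanted_url`, the list of allowed schemes, the trick of prepending http:// to a www. candidate, and the "host must contain a dot" plausibility rule. It does not handle IPv6 literals or other corner cases.

```python
import re

# Component-splitting regexp from RFC 3986, Appendix B.
# Group 2 = scheme, group 4 = authority, group 5 = path.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")

# Assumption: the schemes the question cares about.
ALLOWED_SCHEMES = {"http", "https", "ftp"}


def is_wanted_url(candidate: str) -> bool:
    """Accept URLs that have an allowed scheme or start with www."""
    if candidate.lower().startswith("www."):
        # Scheme-less form: prepend a scheme so the regexp sees an authority.
        candidate = "http://" + candidate

    # Every part of the Appendix B regexp is optional, so it matches any
    # string; the real decision is made on the extracted components.
    match = URI_RE.match(candidate)
    scheme, authority = match.group(2), match.group(4)

    if scheme is None or scheme.lower() not in ALLOWED_SCHEMES:
        return False

    # Plausibility check in code: strip userinfo and port, then require
    # a dot inside the host name (assumed rule, rejects e.g. localhost).
    host = authority.split("@")[-1].split(":")[0] if authority else ""
    return "." in host.strip(".")


for text in ["www.example.com", "http://example.com", "example.com"]:
    print(text, is_wanted_url(text))
```

Keeping the plausibility rules in ordinary code means each one can be changed or dropped without touching the regexp.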

Upvotes: 1

Philipp Grathwohl

Reputation: 2836

I modified your expression:

((((ht|f)tp(s?))\://)?((www\.)([a-zA-Z0-9-]+\.)([a-zA-Z]{2,8}))(\:[0-9]+)*(/($|[a-zA-Z0-9.\,\;\?\'\+&%\$#\=~_-]+))*)

A pretty good website for checking your expressions: http://gskinner.com/RegExr/

Upvotes: 0
