Reputation: 13522

I need a regex that will pull a URL from a text document

The URLs I'm trying to pull are all in the format www.domain.com. I want to pull them from text documents with a simple regex. It only needs to match www.domain.com, not other URL variations.

What is the simplest regex to use with preg_match_all()?

Upvotes: 0

Answers (3)

Greg

Reputation: 3522

I don't do a whole lot with PHP, but the regex would be something like:

w{3}\.([a-zA-Z0-9\~\!\@\#\$\%\^\&\*\(\)_\-\=\+\\\/\?\.\:\;\'\,]*)?

This will return everything that starts with "www.", including any path or query string that follows the domain name. It ignores the protocol part of the URL (e.g. http://).
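
As a rough illustration of that behaviour with preg_match_all(), here is a minimal sketch. It does not use the pattern above: the character class is swapped for \S* (any run of non-whitespace) to keep the quoting simple, and the sample text is made up.

```php
<?php
// Hypothetical input: one URL with a protocol, one bare, one without "www.".
$text = 'See http://www.example.com/page?id=1 and www.test.org but not ftp.example.com';

// Simplified stand-in (an assumption, not the pattern from the answer):
// "www", a literal dot, then any run of non-whitespace characters.
$pattern = '/www\.\S*/';

preg_match_all($pattern, $text, $matches);

// $matches[0] holds the full matches; the "http://" prefix is not part of them.
print_r($matches[0]);
```

Note that the path and query string are captured along with the domain, so this is broader than a bare www.domain.com match.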

Upvotes: 1

Teneff

Reputation: 32158

/w{3}\.\w{2,}\.\w{3}/

This matches "www.", then a word of two or more characters, then a dot and three more word characters (for example .com or .org).

To also allow hyphens in the domain and match case-insensitively:

/w{3}\.[\w\-]{2,}\.\w{3}/i
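
A minimal sketch of how the case-insensitive pattern could be plugged into preg_match_all(); the sample text and variable names are made up for illustration:

```php
<?php
// Hypothetical sample text; only the www.domain.com forms should match.
$text = 'Visit www.example.com or WWW.My-Site.org, but not example.net.';

// Case-insensitive pattern from this answer, used unchanged.
$pattern = '/w{3}\.[\w\-]{2,}\.\w{3}/i';

// preg_match_all() returns the number of full-pattern matches.
$count = preg_match_all($pattern, $text, $matches);

// $matches[0] holds the matched strings.
print_r($matches[0]);
```

Because the last part is \w{3}, a longer TLD such as .info would only be matched up to its first three characters.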

Upvotes: 2

Homer6

Reputation: 15159

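// Matches mailto:, news://, http(s):// and ftp(s):// URLs up to the next whitespace.
// A scheme is required, so a bare www.domain.com is not matched by this pattern.
preg_match_all('%((mailto\\:|(news|(ht|f)tp(s?))\\://){1}\\S+)%m', $subject, $result, PREG_PATTERN_ORDER);
// $result[0] holds every full match.
foreach ($result[0] as $url) {
    // use $url here
}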

You can also use a class that I wrote, https://github.com/homer6/altumo/blob/master/source/php/String/Url.php, if you want to easily pull out parts of the URL. See the unit test in the same directory for usage.

If you're looking for a good program to tweak your regex patterns, I highly recommend RegexBuddy.

Hope that helps...

Upvotes: 0
