Reputation: 13522
The URLs I'm trying to pull are all in the format www.domain.com. I want to pull them from text documents with a simple regex. It only needs to match www.domain.com, not other URL variations.
What is the simplest regex to use with preg_match_all()?
Upvotes: 0
Views: 114
Reputation: 3522
I don't do a whole lot with PHP, but the regex would be something like:
w{3}\.([a-zA-Z0-9\~\!\@\#\$\%\^\&\*\(\)_\-\=\+\\\/\?\.\:\;\'\,]*)?
This will return everything that starts with "www." (the dot after w{3} is escaped so that it matches a literal period). It will ignore the protocol part of the URL (e.g. http://).
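As a rough illustration of why the dot after w{3} should be escaped, here is a minimal sketch. The sample text and the shortened character class `[a-z0-9.-]` are my own assumptions for the demo, not the full class from the pattern above.

```php
<?php
// Hypothetical input: one real "www." host and one look-alike string.
$text = 'www.example.com and wwwxexample';

// Unescaped dot: "." matches any character, so "wwwx" also satisfies "w{3}.".
preg_match_all('/w{3}.[a-z0-9.-]+/i', $text, $loose);

// Escaped dot: only a literal "." is accepted after "www".
preg_match_all('/w{3}\.[a-z0-9.-]+/i', $text, $strict);

print_r($loose[0]);  // both strings are matched
print_r($strict[0]); // only www.example.com is matched
```

Note that preg_match_all() needs delimiters around the pattern (`/` here), which the bare regex above does not show.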
Upvotes: 1
Reputation: 32158
/w{3}\.\w{2,}\.\w{3}/
This will match "www.", followed by two or more word characters, a dot, and then exactly three word characters (e.g. "com").
To match domains with hyphens or uppercase letters:
/w{3}\.[\w\-]{2,}\.\w{3}/i
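Something like the following shows the second pattern in use with preg_match_all(). This is only a sketch: the sample text and its domains are made up for illustration.

```php
<?php
// Hypothetical sample text; the domains are invented for this demo.
$text = "Visit www.example.com or WWW.My-Site.org, but not http://example.com/page or www.short.io.";

// "www.", then 2+ word characters or hyphens, a dot, and exactly 3 word characters.
// The /i flag makes the match case-insensitive.
$pattern = '/w{3}\.[\w\-]{2,}\.\w{3}/i';

// $count is the number of full matches; $matches[0] holds the matched strings.
$count = preg_match_all($pattern, $text, $matches);

print_r($matches[0]); // contains www.example.com and WWW.My-Site.org
```

Because the last part is \w{3}, a two-letter ending such as www.short.io is skipped, and a longer one such as .info would only be matched up to its first three letters. If you need those, \w{2,} in the last position is one option.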
Upvotes: 2
Reputation: 15159
// Match mailto: links and news/http(s)/ftp(s) URLs, up to the next whitespace.
preg_match_all('%((mailto\\:|(news|(ht|f)tp(s?))\\://)\\S+)%m', $subject, $result, PREG_PATTERN_ORDER);
foreach ($result[0] as $url) {
    // $url holds one full match.
}
You can also use a class that I wrote (https://github.com/homer6/altumo/blob/master/source/php/String/Url.php) if you want to easily pull out parts of the URL. See the unit test in the same directory for usage.
If you're looking for a good program to tweak your regex patterns, I highly recommend RegexBuddy.
Hope that helps...
Upvotes: 0