Reputation: 143
I have a string that contains URLs, and I need to replace those URLs with links, but only if they belong to a whitelist of domains. I have a pattern that replaces the URLs with links, but I don't know how to work the list of accepted domains into the pattern. I use the following code:
$pattern = '/\b((http(s?):\/\/)|(?=www\.))(\S+)/is';
preg_replace($pattern,
'<a href="$1$4" target="_blank">$1$4</a>',
$string);
Upvotes: 0
Views: 150
Reputation: 3200
Before you do your REGEX stuff, you can just check to see if the domain appears in the whitelist.
<?php
$whitelist = array('http://www.google.com', 'http://www.yahoo.com');
$string = 'http://www.google.com';
if (in_array($string, $whitelist)) {
    $pattern = '/\b((http(s?):\/\/)|(?=www\.))(\S+)/is';
    $string = preg_replace($pattern, '<a href="$1$4" target="_blank">$1$4</a>', $string);
}
print $string;
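Note that in_array() only succeeds when the whole string equals a whitelist entry exactly. If the goal is to compare just the host part of a URL, something along these lines might work. This is a minimal sketch, not part of the original answer: the helper name is_whitelisted_host() is hypothetical, and it assumes the whitelist holds bare domains such as google.com and that subdomains should be accepted too.

```php
<?php
// Hypothetical helper (an assumption, not from the original answer): returns true
// when the URL's host is a whitelisted domain or a subdomain of one.
function is_whitelisted_host($url, array $whitelist)
{
    // parse_url() only finds a host when a scheme is present, so add one for bare "www." URLs.
    if (!preg_match('~^https?://~i', $url)) {
        $url = 'http://' . $url;
    }
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host)) {
        return false; // malformed URL or no host component
    }
    $host = strtolower($host);
    foreach ($whitelist as $domain) {
        $domain = strtolower($domain);
        // Exact match, or the host ends with ".domain" (i.e. it is a subdomain).
        if ($host === $domain || substr($host, -strlen($domain) - 1) === '.' . $domain) {
            return true;
        }
    }
    return false;
}

$whitelist = array('google.com', 'yahoo.com');
var_dump(is_whitelisted_host('http://www.google.com/search', $whitelist)); // bool(true)
var_dump(is_whitelisted_host('http://notgoogle.com', $whitelist));         // bool(false)
```

Comparing the parsed host rather than the raw string avoids treating notgoogle.com, or a whitelisted name that merely appears in the path, as a match.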
EDIT:
So for this, I turned the string into an array and then looped through each part of that array. Then I checked whether that part matched any of the whitelisted domains. If so, I plopped in your REGEX stuff; if not, it got left alone. Then I added each part to a new array, which I turned back into a string. I also applied CodeAngry's suggestion of using ~ instead of / as the delimiter for matching URLs.
<?php
$domain_array_new = array();
$whitelist = array('google.com', 'yahoo.com');
$string = 'subdomain.google.com Lorem yahoo.com Ipsum is simply microsoft.com dummy text www.google.com of the printing and typesetting industry.';
$domain_array = explode(' ', $string);
foreach ($domain_array AS $domain_part) {
    foreach ($whitelist AS $whitelist_domain) {
        if (preg_match('/'.preg_quote($whitelist_domain, '/').'/', $domain_part)) {
            $pattern = '~\b((http(s?)://)|(?=www\.))(\S+)~is';
            $domain_part = preg_replace($pattern, '<a href="$1$4" target="_blank">$1$4</a>', $domain_part);
            break; // stop after the first match so an already-linked part is not processed again
        }
    }
    $domain_array_new[] = $domain_part;
}
$string = implode(' ', $domain_array_new);
print $string;
Now, this works somewhat, but you need to do some more work on your regular expression. The only URL that it picked up was www.google.com. It did not pick up yahoo.com or subdomain.google.com because those do not have an http(s)? or www in front of them.
EDIT #2:
I played around with this a little bit more and came up with an easier method: doing a find/replace directly instead of breaking the string up into an array, processing it, and then turning it back into a string.
// YOUR WHITELIST ARRAY
$whitelist = array('google.com', 'yahoo.com', 'microsoft.com');
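// TURN YOUR ARRAY INTO AN "OR" STRING TO BE USED FOR MATCHING
// (preg_quote ESCAPES REGEX CHARACTERS SO THE "." IN A DOMAIN ONLY MATCHES A LITERAL DOT)
$whitelist_matching_string = implode('|', array_map(function ($domain) { return preg_quote($domain, '~'); }, $whitelist));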
// DO AN INLINE FIND/REPLACE
$string = preg_replace('~((http(s)?://)?(([-A-Z0-9.]+)?('.$whitelist_matching_string.')(\S+)?))~i', '<a href="http://$4">$1</a>', $string);
print $string;
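An alternative worth considering is to keep the URL pattern from the question and do the whitelist check in a callback, so the domains never have to be spliced into the regex. The following is only a sketch under some assumptions: the function name linkify_whitelisted() is made up for illustration, the whitelist holds bare domains, subdomains are accepted, and URLs without a scheme get http:// in the href.

```php
<?php
// Sketch only: linkify_whitelisted() is a hypothetical name, not from the original answer.
function linkify_whitelisted($text, array $whitelist)
{
    // Same idea as the pattern in the question, with ~ as the delimiter.
    $pattern = '~\b((https?://)|(?=www\.))(\S+)~i';

    return preg_replace_callback($pattern, function ($m) use ($whitelist) {
        $url = $m[0];
        // parse_url() needs a scheme to find the host, so add one for bare "www." URLs.
        $href = preg_match('~^https?://~i', $url) ? $url : 'http://' . $url;
        $host = parse_url($href, PHP_URL_HOST);
        if (!is_string($host)) {
            return $url; // could not parse: leave the text untouched
        }
        $host = strtolower($host);
        foreach ($whitelist as $domain) {
            $domain = strtolower($domain);
            // Accept the domain itself or any subdomain of it.
            if ($host === $domain || substr($host, -strlen($domain) - 1) === '.' . $domain) {
                return '<a href="' . htmlspecialchars($href, ENT_QUOTES) . '" target="_blank">'
                    . htmlspecialchars($url, ENT_QUOTES) . '</a>';
            }
        }
        return $url; // not whitelisted: leave the text untouched
    }, $text);
}

$whitelist = array('google.com', 'yahoo.com');
print linkify_whitelisted('See https://www.google.com/maps and http://evil.example/google.com now', $whitelist);
```

Because the check runs on the parsed host, a whitelisted name that only appears in the path (as in the second URL above) is not linked, and the original scheme is preserved in the href. Like the original pattern, \S+ will also swallow trailing punctuation, and bare domains without http(s):// or www are still not matched.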
Let me know if this works better for you.
Upvotes: 1