Reputation: 13
I need to be able to parse some text and find all the instances where an tag has target="_blank".... and for each match, add (for example): This link opens in a new window before the closeing tag.
For example:
Before:
<a href="http://any-website-on-the-internet-or-local-path" target="_blank">Go here now</a>
After:
<a href="http://any-website-on-the-internet-or-local-path" target="_blank">Go here now<span>(This link opens in a new window)</span></a>
This is for a PHP site, so i assume preg_replace() will be the method... i just dont have the skills to write the regex properly.
Thanks in advance for any help anyone can offer.
Upvotes: 1
Views: 3114
Reputation: 6527
Well @Kolink is right, but there's my RegExp version.
$string = '<p>mess</p><a href="http://google.com.br/" target="_blank" class=\"bother\">Google</a><p>mess</p>';
echo preg_replace("/(\<a.*?target=\"_blank\".*?>)(.*?)(\<\/a\>)/miU","$1$2(This link opens in a new window)$3",$string);
Upvotes: 0
Reputation: 89557
This does the job:
$newText = '<span>(This link opens in a new window)</span>';
$pattern = '~<a\s[^>]*?\btarget\s*=(?:\s*([\'"])_blank\1|_blank\b)[^>]*>[^<]*(?:<(?!/a>)[^<]*)*\K~i';
echo preg_replace($pattern, $newText, $html);
However this direct string approach may replace also commented html parts, strings or comments in css or javascript code and eventually inside javascript literal regexes, that is at best unneeded and at worst unwanted at all. That's why you should use a DOM approach if you want to avoid these pitfalls. All you have to do is to append a new node to each link with the desired attribute:
$dom = new DOMDocument;
libxml_use_internal_errors(true);
$dom->loadHTML($html);
$xp = new DOMXPath($dom);
$nodeList = $xp->query('//a[@target="_blank"]');
foreach($nodeList as $node) {
$newNode = dom->createElement('span', '(This link opens in a new window)');
$node->appendChild($newNode);
}
$html = $dom->saveHTML();
To finish, a last alternative consists to not change the html at all and to play with css:
a[target="_blank"]::after {
content: " (This link opens in a new window)";
font-style: italic;
color: red;
}
Upvotes: -1
Reputation: 510
This will work for anchor tag replacement....
$string = str_replace('<a ','<a target="_blank" ',$string);
Upvotes: 1
Reputation: 27
You won't be able to write a regex that will evaluate an infinitely long string. I suggest:
$h = explode('>', $html);
This will give you the chance to traverse it like any other array and then do:
foreach($h as $k){
if(!preg_match('/^<a href=/', $k){
continue;
}elseif(!preg_match(/target="_blank")/, $k){
continue;
}else{
$h[$k + 1] .= '(open in new window);
}
}
$html = implode('>', $h);
This is how I would approach such a problem. of course, I just threw this out off the top of my head and is note guaranteed to work as is, but with a few possible tweaks to your exact logic, and you will have what you need.
Upvotes: -2
Reputation: 324650
You should never use a regex to parse HTML, except maybe in extremely well-defined and controlled circumstances.
Instead, try a built-in parser:
$dom = new DOMDocument();
$dom->loadHTML($your_html_source);
$xpath = new DOMXPath($dom);
$links = $xpath->query("//a[@target='_blank']");
foreach($links as $link) {
$link->appendChild($dom->createTextNode(" (This link opens in a new window)"));
}
$output = $dom->saveHTML();
Aternatively, if this is being output to the browser, you can just use CSS:
a[target='_blank']:after {
content: ' (This link opens in a new window)';
}
Upvotes: 8