Reputation: 8375
I almost have my regular expression down for skimming HTML pages, but I have run into two issues that I am trying to get squashed before I can proceed. I need to be able to match both an empty value and a lone slash (each followed by the closing quote), but I have exhausted my ability to see what I'm doing wrong. Could someone help me with the final bit?
$pathspec='in-front';
$subjects = array(
'<base href="http://foo.com/images/" target="_blank">', # no changes (correct)
'<base href="/" target="_blank">', # '/in-front/' (fails)
'<a href="https://foo.com/images/">Foo</a>', # no changes (correct)
'<a href="">Foo</a>', # '/in-front/' (fails)
'<img src="bar/foo.png" />', # no changes (correct)
'<img src="/bar/foo.png" />', # '/in-front/bar/foo.png' (correct)
);
foreach ($subjects AS $subject)
echo preg_replace( '/(href|src)=["\']?\/(?!\/)([^"\'>]+)["\']?/', "$1='/$pathspec/$2'", $subject ) . "\n";
die;
The expected output for each subject is in the trailing comment. Thank you.
Upvotes: 2
Views: 104
Reputation: 89557
You can use this pattern:
$pattern = '~\b(?:href|src)\s*=\s*(["\']?+)\K(?:/|(?=[\s>]|\1))~i';
$replacement = "/$pathspec/";
$result = preg_replace($pattern, $replacement, $subject);
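To see how the pattern behaves, here is a minimal sketch that runs it over the subjects from the question. The helper name `prefix_root_urls` is hypothetical and only there for illustration. `(["\']?+)` possessively captures an optional opening quote, `\K` drops everything matched so far from the reported match, and the final group matches either a leading `/` or an empty position directly before whitespace, `>`, or the same quote. Only that last piece is replaced, so the original quotes are kept.

```php
<?php
// Hypothetical helper wrapping the pattern above (the name is an assumption).
function prefix_root_urls(string $html, string $pathspec): string
{
    // After \K, only the leading "/" (or an empty position in front of the
    // closing quote, whitespace or ">") is part of the match that gets replaced.
    $pattern = '~\b(?:href|src)\s*=\s*(["\']?+)\K(?:/|(?=[\s>]|\1))~i';
    return preg_replace($pattern, "/$pathspec/", $html);
}

$subjects = array(
    '<base href="http://foo.com/images/" target="_blank">', // unchanged
    '<base href="/" target="_blank">',   // <base href="/in-front/" target="_blank">
    '<a href="https://foo.com/images/">Foo</a>',            // unchanged
    '<a href="">Foo</a>',                // <a href="/in-front/">Foo</a>
    '<img src="bar/foo.png" />',                            // unchanged
    '<img src="/bar/foo.png" />',        // <img src="/in-front/bar/foo.png" />
);

foreach ($subjects as $subject) {
    echo prefix_root_urls($subject, 'in-front'), "\n";
}
```

Unlike the pattern in the question, this one has no `(?!/)` check, so a protocol-relative URL such as `href="//foo.com/x"` would also get the prefix. Changing `/` to `/(?!/)` inside the final group should avoid that if it matters for your pages.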
Upvotes: 1
Reputation: 16045
See if this works for you:
$result = preg_replace('#(href|src)=["\'](?:/|/(?!\/)(\S+?)|)["\']#', "$1='/$pathspec/$2'", $subject);
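As a rough check, the sketch below applies this pattern to the three subjects the question says need rewriting. The helper name `rewrite_root_urls` is hypothetical. The alternation covers a lone `/`, a `/` followed by a path that does not start with a second slash, and an empty value. The whole attribute is rebuilt by the replacement, so the output uses single quotes whatever the input used.

```php
<?php
// Hypothetical helper wrapping the pattern above (the name is an assumption).
function rewrite_root_urls(string $html, string $pathspec): string
{
    // $2 is empty for the lone-slash and empty-value cases, leaving only the prefix.
    $pattern = '#(href|src)=["\'](?:/|/(?!\/)(\S+?)|)["\']#';
    return preg_replace($pattern, "$1='/$pathspec/$2'", $html);
}

echo rewrite_root_urls('<base href="/" target="_blank">', 'in-front'), "\n";
// <base href='/in-front/' target="_blank">
echo rewrite_root_urls('<a href="">Foo</a>', 'in-front'), "\n";
// <a href='/in-front/'>Foo</a>
echo rewrite_root_urls('<img src="/bar/foo.png" />', 'in-front'), "\n";
// <img src='/in-front/bar/foo.png' />
```

Absolute URLs and relative paths without a leading slash do not match any branch, so those subjects pass through unchanged.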
Upvotes: 2