Reputation: 2536
I've got a string containing HTML, and I want to change <img src="anything.jpg">
to <img src="'.DOC_ROOT .'anything.jpg">
every time it occurs in the string. I really don't want to use an HTML parser, since this is the only thing I'd be using it for. Does anyone know how to do this in PHP, for example with a regex?
Upvotes: 1
Views: 5773
Reputation: 73044
If you absolutely have to use regular expressions instead of a DOM parser, you could use this.
I'm not sure where DOC_ROOT comes from, though, since it isn't a valid PHP variable (maybe it's a constant?). Also be aware that variables are not interpolated inside single-quoted strings, so an embedded variable won't be expanded there.
You probably want something more like:
img.*?src=['"](.*?)['"]
Replacing with:
img src="$_SERVER['DOCUMENT_ROOT']/$1"
Which converts:
echo "<img src='anything.jpg'>"; //into:
echo "<img src='{$_SERVER['DOCUMENT_ROOT']}/anything.jpg'>";
In php, the code would look like this:
$string = "<img src='anything.jpg'>";
echo preg_replace('/img.*?src=[\'\"](.*?)[\'\"]/', "img src='$_SERVER[DOCUMENT_ROOT]/$1'", $string);
Be warned that if your HTML is irregular (a tag misplaced here and there, spaces around the = sign), this regex is liable to cause a lot of problems. That's where a DOM parser like DOMDocument comes in handy.
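To make the whitespace caveat concrete, here is a minimal sketch of a somewhat more tolerant pattern. The helper name `prefixImgSrc` and the prefix `/var/www/` are assumptions for illustration, not part of the answer above. It allows spaces around the = sign, keeps the original quote character, and leaves other attributes in place, but it is still a regex and still not DOM-aware.

```php
<?php
// Hypothetical helper (not from the original answer): prefix the src of
// every <img> tag, tolerating whitespace around "=".
function prefixImgSrc(string $html, string $prefix): string
{
    // Group 1: "<img", any other attributes, then src and "=" with optional spaces.
    // Group 2: the opening quote character (single or double).
    // Group 3: the original src value, up to the matching closing quote.
    $pattern = '/(<img\b[^>]*?\ssrc\s*=\s*)(["\'])(.*?)\2/i';

    return preg_replace_callback(
        $pattern,
        // A callback avoids "$1"-style references being misread
        // if the prefix itself contains "$" or "\".
        fn (array $m): string => $m[1] . $m[2] . $prefix . $m[3] . $m[2],
        $html
    );
}

$prefix = '/var/www/'; // assumed value, standing in for the document root
echo prefixImgSrc("<img src = 'anything.jpg'>", $prefix), "\n";
echo prefixImgSrc('<img alt="x" src="anything.jpg">', $prefix), "\n";
```

This only narrows the gap; markup such as an attribute value that happens to contain `<img` can still fool it.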
Upvotes: 5
Reputation: 48069
A lot of people state the importance of using a DOM parser, but too few answers actually demonstrate how to execute the task.
Regex, however tempting for a one-liner or a single-character change, is unsuitable for parsing HTML because it is DOM-ignorant: it treats your input as a string and nothing more. I've crafted a demonstration of how regex (from the accepted answer) will make unintended replacements.
Code: (Demo)
$html = <<<HTML
<p>Some random text <img src="anything.jpg"> text <iframe data-whoops="<img" src="anything.jpg"></iframe></p>
HTML;
define('DOC_ROOT', 'www.example.com/');
echo "With regex:\n";
echo preg_replace('/<img([^>]*)src=["\']([^"\'\\/][^"\']*)["\']/', '<img\1src="'.DOC_ROOT.'\2"', $html);
echo "\n\n---\n\nWith a parser:\n";
$dom = new DOMDocument;
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
foreach ($dom->getElementsByTagName('img') as $img) {
    $img->setAttribute('src', DOC_ROOT . $img->getAttribute('src'));
}
echo $dom->saveHTML();
Output:
With regex:
<p>Some random text <img src="www.example.com/anything.jpg"> text <iframe data-whoops="<img" src="www.example.com/anything.jpg"></iframe></p>
---
With a parser:
<p>Some random text <img src="www.example.com/anything.jpg"> text <iframe data-whoops="<img" src="anything.jpg"></iframe></p>
If you need to make conditional replacements on an img tag's URL, there are additional tools, such as a URL parser or XPath, that can be used to meet your requirements.
https://stackoverflow.com/a/60263813/2943403
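As a rough sketch of what such a conditional replacement could look like, the following combines XPath with `parse_url()`. The helper name `prefixRelativeImgSrc` and the rule "only prefix plain relative paths" are my assumptions for illustration; adjust the condition to whatever your requirements are.

```php
<?php
// Hypothetical helper (name and rules are assumptions for illustration):
// prefix only those <img> src values that are plain relative paths.
function prefixRelativeImgSrc(string $html, string $prefix): string
{
    $previous = libxml_use_internal_errors(true); // collect parse warnings quietly
    $dom = new DOMDocument();
    $dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    // XPath selects only <img> elements that actually carry a src attribute.
    foreach ($xpath->query('//img[@src]') as $img) {
        $src = $img->getAttribute('src');
        $parts = parse_url($src);

        // Skip unparsable or empty values, anything with a scheme or host
        // (absolute or protocol-relative URLs), and root-relative paths.
        if ($parts === false || $src === ''
            || isset($parts['scheme']) || isset($parts['host'])
            || $src[0] === '/') {
            continue;
        }

        $img->setAttribute('src', $prefix . $src);
    }

    return $dom->saveHTML();
}

$html = '<p><img src="anything.jpg"> <img src="https://cdn.example.com/logo.png"></p>';
echo prefixRelativeImgSrc($html, 'www.example.com/');
```

Only the first image's src should be prefixed here; the absolute URL is left alone because `parse_url()` reports a scheme for it.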
Ultimately, my advice is to forget about how many lines of code you write; just write robust/reliable code.
Upvotes: 1
Reputation: 19122
You really should use a parser, but since you made clear that you really don't want to do that, you can use the following regex replacement:
$string = preg_replace('/<img([^>]*)src=["\']([^"\'\\/][^"\']*)["\']/', '<img\1src="'.DOC_ROOT.'\2"', $string);
Demo. This regular expression will not modify any URLs that already begin with a slash. Change it to the following if you do want to match those (the leading slash is dropped before DOC_ROOT is prepended):
$string = preg_replace('/<img([^>]*)src=["\']["\'\\/]?([^"\']*)["\']/', '<img\1src="'.DOC_ROOT.'\2"', $string);
Demo.
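For a quick side-by-side, here is a small sketch that runs both patterns over two sample tags. The `DOC_ROOT` value and the variable names are assumptions for illustration only.

```php
<?php
define('DOC_ROOT', '/var/www/'); // assumed value for illustration

// First pattern: leaves src values that start with a slash untouched.
$skipLeadingSlash  = '/<img([^>]*)src=["\']([^"\'\\/][^"\']*)["\']/';
// Second pattern: also matches those, dropping the leading slash.
$stripLeadingSlash = '/<img([^>]*)src=["\']["\'\\/]?([^"\']*)["\']/';
$replacement = '<img\1src="' . DOC_ROOT . '\2"';

$samples = ['<img src="anything.jpg">', '<img src="/anything.jpg">'];
foreach ($samples as $html) {
    echo $html, "\n";
    echo '  first:  ', preg_replace($skipLeadingSlash, $replacement, $html), "\n";
    echo '  second: ', preg_replace($stripLeadingSlash, $replacement, $html), "\n";
}
```

With the first pattern the `/anything.jpg` tag comes back unchanged, while the second rewrites both tags to `/var/www/anything.jpg`. Note that neither pattern special-cases absolute URLs such as `https://...`; both would prefix those too.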
Upvotes: 4
Reputation: 99
That's what you are looking for, I think:
$pictureName = 'anything.jpg';
$html = str_replace($pictureName, DOC_ROOT.$pictureName, $html);
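One thing to watch with this approach: `str_replace` touches every occurrence of the file name, not just the one inside `src`. A minimal sketch of the difference, assuming a made-up `DOC_ROOT` value, a file name known in advance, and double-quoted attributes:

```php
<?php
define('DOC_ROOT', '/var/www/'); // assumed value for illustration

$pictureName = 'anything.jpg';
$html = '<p>See anything.jpg: <img src="anything.jpg" alt="anything.jpg"></p>';

// Plain replacement: also rewrites the visible text and the alt attribute.
$everywhere = str_replace($pictureName, DOC_ROOT . $pictureName, $html);

// Including src="..." in the search string narrows the match to that attribute.
$srcOnly = str_replace(
    'src="' . $pictureName . '"',
    'src="' . DOC_ROOT . $pictureName . '"',
    $html
);

echo $everywhere, "\n", $srcOnly, "\n";
```

The narrowed version still depends on the exact quoting and spacing of the attribute, so it only suits markup you control.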
Upvotes: -1