Jonan

Reputation: 2536

Change the src part of an img tag

I've got a string containing HTML code, and I want to change <img src="anything.jpg"> to <img src="'.DOC_ROOT .'anything.jpg"> every time it occurs in the string. I really don't want to use an HTML parser, since this will be the only thing I'll be using it for. Does anyone know how to do this in PHP, using a regex for example?

Upvotes: 1

Views: 5773

Answers (4)

brandonscript

Reputation: 73044

If you absolutely have to use regular expressions instead of a DOM parser, you could use this.

Not sure where DOC_ROOT is coming from though, since it's not a valid PHP variable (maybe a constant?). Also be aware that PHP does not interpolate variables inside single-quoted strings, so an embedded variable won't be expanded there.

You probably want something more like:

img.*?src=['"](.*?)['"]

Replacing with:

img src="$_SERVER['DOCUMENT_ROOT']$1"

Which converts:

echo "<img src='anything.jpg'>"; //into:
echo "<img src='{$_SERVER['DOCUMENT_ROOT']}/anything.jpg'>";

http://regex101.com/r/vN7lN9

In php, the code would look like this:

$string = "<img src='anything.jpg'>";
echo preg_replace('/img.*?src=[\'\"](.*?)[\'\"]/', "img src='$_SERVER[DOCUMENT_ROOT]/$1'", $string);

Be warned that if your markup contains irregular HTML (a tag misplaced here and there, spaces around the = sign) you're liable to end up causing a lot of problems. That's where a DOM parser comes in handy.
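To illustrate the kind of irregularity mentioned above, here is a minimal sketch of a slightly more tolerant regex variant. It is not the pattern from this answer: the function name prefixImgSrc and the /var/www/ prefix are made up for the example. It allows whitespace around the = sign and keeps any attributes that appear before src, but it is still plain string matching and can be fooled by unusual markup.

```php
<?php
// Hypothetical helper, not part of the original answer.
function prefixImgSrc(string $html, string $prefix): string
{
    // Group 1: "<img" up to the opening quote, so earlier attributes are kept.
    // Group 2: the quote character, reused via \2 so the closing quote must match.
    // Group 3: the original src value.
    $pattern = '/(<img\b[^>]*?\ssrc\s*=\s*)(["\'])(.*?)\2/i';

    return preg_replace_callback(
        $pattern,
        function (array $m) use ($prefix): string {
            // Rebuild the match with the prefix placed in front of the old value.
            return $m[1] . $m[2] . $prefix . $m[3] . $m[2];
        },
        $html
    );
}

// The prefix here is an assumed stand-in for DOC_ROOT / $_SERVER['DOCUMENT_ROOT'].
echo prefixImgSrc('<img class="logo" src = "anything.jpg">', '/var/www/'), "\n";
```

Requiring whitespace before src keeps attributes such as data-src from being matched, which the simpler pattern cannot guarantee.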

Upvotes: 5

mickmackusa

Reputation: 48069

A lot of people state the importance of using a DOM parser, but too few answers actually demonstrate how to execute the task.

Regex, however tempting for writing a one-liner or changing a single character, is unsuitable for parsing HTML because it is DOM-ignorant: it treats your input as a string and nothing more. I've crafted a demonstration of how the regex from the accepted answer will make unintended replacements.

Code: (Demo)

$html = <<<HTML
<p>Some random text <img src="anything.jpg"> text <iframe data-whoops="<img" src="anything.jpg"></iframe></p>
HTML;

define('DOC_ROOT', 'www.example.com/');

echo "With regex:\n";
echo preg_replace('/<img([^>]*)src=["\']([^"\'\\/][^"\']*)["\']/', '<img\1src="'.DOC_ROOT.'\2"', $html);

echo "\n\n---\n\nWith a parser:\n";

$dom = new DOMDocument;
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
foreach ($dom->getElementsByTagName('img') as $img) {
    $img->setAttribute('src', DOC_ROOT . $img->getAttribute('src'));
}
echo $dom->saveHTML();

Output:

With regex:
<p>Some random text <img src="www.example.com/anything.jpg"> text <iframe data-whoops="<img" src="www.example.com/anything.jpg"></iframe></p>

---

With a parser:
<p>Some random text <img src="www.example.com/anything.jpg"> text <iframe data-whoops="&lt;img" src="anything.jpg"></iframe></p>

If you need to make conditional replacements on an img tag's URL, there are additional tools, like a URL parser or XPath, that can be used to meet your requirements.

https://stackoverflow.com/a/60263813/2943403
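As a rough sketch of the conditional replacement mentioned above, the following combines DOMXPath with parse_url() so that only relative src values get the prefix. The function name prefixRelativeImgSrc and the rule for what counts as "relative" are my own assumptions for illustration; adjust the condition to your data.

```php
<?php
// Hypothetical helper; the skip rules below are an assumption, not a standard.
function prefixRelativeImgSrc(string $html, string $prefix): string
{
    $dom = new DOMDocument();
    // Collect libxml warnings internally instead of emitting them for imperfect markup.
    libxml_use_internal_errors(true);
    $dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    // Select only <img> elements that actually have a src attribute.
    foreach ($xpath->query('//img[@src]') as $img) {
        $src = $img->getAttribute('src');
        $parts = parse_url($src);

        // Leave alone: unparsable URLs, URLs with a scheme or host,
        // and paths that already start with a slash.
        if ($parts === false
            || isset($parts['scheme'])
            || isset($parts['host'])
            || strpos($src, '/') === 0
        ) {
            continue;
        }

        $img->setAttribute('src', $prefix . $src);
    }

    return $dom->saveHTML();
}

$html = '<p><img src="anything.jpg"><img src="https://cdn.example.org/b.png"></p>';
echo prefixRelativeImgSrc($html, 'www.example.com/');
```

The XPath expression narrows the node list up front, and parse_url() handles the URL inspection, so no hand-written pattern is needed for either step.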

Ultimately, my advice is to forget about how many lines of code you write; just write robust/reliable code.

Upvotes: 1

Joeytje50

Reputation: 19122

You really should use a parser, but since you made clear that you really don't want to do that, you can use the following regex replace:

$string = preg_replace('/<img([^>]*)src=["\']([^"\'\\/][^"\']*)["\']/', '<img\1src="'.DOC_ROOT.'\2"', $string);

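Demo. This regular expression will not modify any URLs that already start with a slash, because the first character of the src value is not allowed to be /. Change it to the following if you do want to match those (the leading slash is dropped before DOC_ROOT is prepended):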

$string = preg_replace('/<img([^>]*)src=["\']["\'\\/]?([^"\']*)["\']/', '<img\1src="'.DOC_ROOT.'\2"', $string);

Demo.
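A small side-by-side sketch of the two patterns above, in case the difference isn't obvious. The wrapper function names and the www.example.com/ prefix are made up for this example; the patterns and replacements are the ones from this answer.

```php
<?php
// Hypothetical wrappers around the two patterns shown above.
function prefixSkippingSlash(string $html, string $root): string
{
    // First pattern: src values starting with "/" are left untouched.
    return preg_replace(
        '/<img([^>]*)src=["\']([^"\'\\/][^"\']*)["\']/',
        '<img\1src="' . $root . '\2"',
        $html
    );
}

function prefixStrippingSlash(string $html, string $root): string
{
    // Second pattern: an optional leading "/" is consumed outside the capture group.
    return preg_replace(
        '/<img([^>]*)src=["\']["\'\\/]?([^"\']*)["\']/',
        '<img\1src="' . $root . '\2"',
        $html
    );
}

$root = 'www.example.com/'; // assumed value standing in for DOC_ROOT

echo prefixSkippingSlash('<img src="/anything.jpg">', $root), "\n";
// prints: <img src="/anything.jpg">
echo prefixStrippingSlash('<img src="/anything.jpg">', $root), "\n";
// prints: <img src="www.example.com/anything.jpg">
```

Both functions give the same result for a src without a leading slash, such as anything.jpg.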

Upvotes: 4

dincan

Reputation: 99

This is what you are looking for, I think:

$pictureName = 'anything.jpg';

$html = str_replace($pictureName, DOC_ROOT.$pictureName, $html);

Upvotes: -1
