I-M-JM

Reputation: 15950

Replace the src attribute value of all <img> tags in an HTML document

I have the following PHP code, which matches an <img> tag by its src value and replaces the tag with one that uses a new URL:

$rep = array('/', '+', '(', ')');
$with = array('\/', '\+', '\(', '\)');

$match_pattern = '/<img[^<]*src\s*=\s*\"'.str_replace($rep, $with, $source_url).'\"[^>]*>/iUu';
$img_replace_str = '<img src="'.$new_url.'" />';
$post_content = preg_replace($match_pattern, $img_replace_str, $post_content);

For images whose src is a plain URL such as http://www.example.com/a.jpg, there is no issue, but for images whose src contains a query string, such as http://www.example.com/b.jpg?height=900, the pattern does not match.

I want to match image tags with and without a query string.

Upvotes: 0

Views: 173

Answers (2)

mickmackusa

Reputation: 47894

Use a legitimate DOM parser to easily and intuitively replace src attribute values of <img> tags containing any manner of attributes. XPath does an exquisitely direct job of targeting the src attribute of <img> tags ONLY.

Code:

$html = <<<HTML
<div>
Here is an img tag with no qs <img src="http://www.example.com/a.jpg">,
 an img with no src <img title="to be determined">,
 and here is another with a qs <img src="http://www.example.com/b.jpg?height=900">.
Here is a <iframe src="http://www.example.com/c.jpg?foo=bar"></iframe> and
 a submit button <input type="image" src="http://www.example.com/d.jpg?boo=far&what=now" alt="Submit">
</div>
HTML;

$newUrl = 'https://www.example.com/new.jpg';

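$dom = new DOMDocument();
// Collect libxml parse warnings internally instead of emitting them.
libxml_use_internal_errors(true);
// Load the fragment without adding implied <html>/<body> tags or a doctype.
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$xpath = new DOMXPath($dom);
// Select only the src attributes that belong to <img> elements.
foreach ($xpath->query("//img/@src") as $src) {
    $src->value = $newUrl;
}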

echo $dom->saveHTML();

Output (after two qualifying replacements):

<div>
Here is an img tag with no qs <img src="https://www.example.com/new.jpg">,
 an img with no src <img title="to be determined">,
 and here is another with a qs <img src="https://www.example.com/new.jpg">.
Here is a <iframe src="http://www.example.com/c.jpg?foo=bar"></iframe> and
 a submit button <input type="image" src="http://www.example.com/d.jpg?boo=far&amp;what=now" alt="Submit">
</div>
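
The question replaces one particular source URL rather than every image, so the same approach can be narrowed with a plain string comparison. The following is only a minimal sketch under assumptions: the sample markup, the $sourceUrl variable, and the $replaced counter are hypothetical names introduced for illustration and are not part of the original snippet.

```php
<?php
// Hypothetical inputs for illustration only.
$html = '<div><img src="http://www.example.com/a.jpg"> '
      . '<img src="http://www.example.com/b.jpg?height=900"></div>';
$sourceUrl = 'http://www.example.com/b.jpg?height=900';
$newUrl = 'https://www.example.com/new.jpg';

$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$xpath = new DOMXPath($dom);

$replaced = 0;
foreach ($xpath->query('//img/@src') as $src) {
    // Plain string comparison: "?" and other regex metacharacters need no escaping.
    if ($src->value === $sourceUrl) {
        $src->value = $newUrl;
        $replaced++;
    }
}

$output = $dom->saveHTML();
echo $output;
```

Because the comparison is done on the parsed attribute value, the query string is treated as ordinary text, and the other attributes of the matched <img> tag are left untouched.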

Upvotes: 0

jwueller

Reputation: 30996

You can use PHP's preg_quote() function instead of str_replace(). It escapes all regular-expression special characters (see the docs), and passing '/' as its second argument also escapes the pattern delimiter. That should solve the problem, since your str_replace() solution did not escape ?, which is a special character in regular expressions:

$match_pattern = '/<img[^<]*src\s*=\s*\"'.preg_quote($source_url, '/').'\"[^>]*>/iUu';
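
To check the pattern against a URL with a query string, a minimal sketch follows. The sample markup and URLs are assumed for illustration, and the variable names are the ones used in the question.

```php
<?php
// Hypothetical sample data for illustration only.
$source_url = 'http://www.example.com/b.jpg?height=900';
$new_url = 'https://www.example.com/new.jpg';
$post_content = '<p><img class="x" src="http://www.example.com/b.jpg?height=900" alt=""></p>';

// preg_quote() escapes ".", "?", "+", "(", ")" and the other metacharacters,
// plus the "/" delimiter given as the second argument.
$match_pattern = '/<img[^<]*src\s*=\s*\"' . preg_quote($source_url, '/') . '\"[^>]*>/iUu';
$img_replace_str = '<img src="' . $new_url . '" />';

$result = preg_replace($match_pattern, $img_replace_str, $post_content);
echo $result;
// prints: <p><img src="https://www.example.com/new.jpg" /></p>
```

Note that the whole tag is replaced, so any other attributes on the matched <img> (class and alt in this sample) are dropped, exactly as in the original code.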

Upvotes: 2
