Kevin M

Reputation: 1312

Remove image from string based on src url

I am looking for an easy and efficient way to remove a specific image from an article. All that I know is the image URL of the image that I need to remove.

My choice would be either regex or DOMDocument, probably using an HTML5 parser like https://github.com/Masterminds/html5-php.

My regex skills are not that good, and I'm not sure regex is a good idea here, because I have read that it should be avoided for parsing HTML. What I have so far with regex removes every image; I'm not sure how to remove only the one with a specific src URL.

$img_src = 'http://www.example.org/image_to_be_removed.jpg';

$article = '<h1>Test article with HTML5 tags</h1>
<nav><a href="/link1/">Link 1</a></nav>
<p>This is an example article. The article may or may not include html5 tags, images and other things.</p>
<img src="http://www.example.org/image_to_be_removed.jpg">
<p>More example text.</p>';

$article = preg_replace("/<img[^>]+\>/i", "", $article);
echo $article;

I haven't looked into a DOMDocument solution yet, because I'm not sure whether it is even possible, or whether regex is considered best practice here.

Upvotes: 1

Views: 1769

Answers (4)

The fourth bird

Reputation: 163597

Parsing HTML with regex is not recommended.

As you suggested, you could use DOMDocument, or alternatively PHP Simple HTML DOM Parser.

Since all you know is the URL of the image to remove, you can select the img elements (with XPath or by tag name) and compare each one's src attribute against that URL.

Example DOMDocument:

$img_src = 'http://www.example.org/image_to_be_removed.jpg';
$article = '<h1>Test article with HTML5 tags</h1>
<nav><a href="/link1/">Link 1</a></nav>
<p>This is an example article. The article may or may not include html5 tags, images and other things.</p><img src="http://www.example.org/image_to_be_removed.jpg"><img src="http://www.example.org/image_not_to_be_removed.jpg">
<p>More example text.</p>';
$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTML($article);
$xpath = new DOMXPath($dom);
$elements = $xpath->query("//img");
foreach ($elements as $element) {
    // Remove only the image whose src matches the given URL exactly.
    if ($element->getAttribute("src") === $img_src) {
        $element->parentNode->removeChild($element);
    }
}
echo $dom->saveHTML();
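
Two details of the DOMDocument approach are worth knowing. Called without arguments, saveHTML() prints a complete document, including a doctype and the html/body wrappers that loadHTML() adds. And the "look for the tag name" route via getElementsByTagName() returns a live list that shrinks while nodes are being removed. The following is a minimal sketch of that variant, not the code from this answer; the function name remove_image_by_src is made up for illustration.

```php
<?php
// Hypothetical helper (name is an assumption): removes every <img> whose
// src equals $src and returns only the fragment, without <html>/<body>.
function remove_image_by_src(string $html, string $src): string
{
    // Collect parser warnings (e.g. for HTML5 tags like <nav>) instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // getElementsByTagName() returns a live list, so copy it into an array
    // before removing nodes; otherwise the loop can skip elements.
    foreach (iterator_to_array($dom->getElementsByTagName('img')) as $img) {
        if ($img->getAttribute('src') === $src) {
            $img->parentNode->removeChild($img);
        }
    }

    // loadHTML() wraps the input in <html><body>; serialize only the body's children.
    $body = $dom->getElementsByTagName('body')->item(0);
    $out = '';
    if ($body !== null) {
        foreach ($body->childNodes as $child) {
            $out .= $dom->saveHTML($child);
        }
    }
    return $out;
}

$sample = '<h1>Title</h1><p>Text</p>'
    . '<img src="http://www.example.org/image_to_be_removed.jpg">'
    . '<img src="http://www.example.org/image_not_to_be_removed.jpg">';
echo remove_image_by_src($sample, 'http://www.example.org/image_to_be_removed.jpg');
```

Because the result is serialized node by node, whitespace between tags may differ slightly from the input. loadHTML() can also misread UTF-8 text unless the markup declares its charset, so articles with non-ASCII characters may need an encoding hint.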

Example PHP Simple HTML DOM Parser using simple_html_dom.php:

$htmlDom = str_get_html($article);
// Select only the images whose src equals the URL to remove.
foreach ($htmlDom->find('img[src=' . $img_src . ']') as $item) {
    $item->outertext = ''; // blank out the whole <img> tag
}
$htmlDom->save();
echo $htmlDom;

Upvotes: 0

prasanna puttaswamy

Reputation: 987

You can try the following, using str_replace:

<?php
$img_src = 'http://www.example.org/image_to_be_removed.jpg';

$article = '<h1>Test article with HTML5 tags</h1>
<nav><a href="/link1/">Link 1</a></nav>
<p>This is an example article. The article may or may not include html5 tags, images and other things.</p>
<img src="http://www.example.org/image_to_be_removed.jpg">
<p>More example text.</p>';
// Remove the whole tag, not just its src attribute; this only matches when the tag is written exactly like this.
$new = str_replace('<img src="' . $img_src . '">', '', $article);
echo $article;
echo '<br/>';
echo $new;
?>

The script echoes both the original and the modified article so you can see the difference. Other functions such as strtr or preg_replace can do the same job; use whichever suits you.
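
One caveat with plain string replacement: it matches only an exact sequence of characters, so it is sensitive to attribute order, extra attributes and quoting style. A small sketch of that limitation, assuming the exact text of the tag is known in advance (the variable names below are illustrative only):

```php
<?php
$img_src = 'http://www.example.org/image_to_be_removed.jpg';

// Assumption: the tag is written exactly like this in the article.
$exact_tag = '<img src="' . $img_src . '">';

$simple = '<p>Intro</p><img src="' . $img_src . '"><p>End</p>';
$styled = '<p>Intro</p><img class="hero" src="' . $img_src . '"><p>End</p>';

$simple_result = str_replace($exact_tag, '', $simple);
$styled_result = str_replace($exact_tag, '', $styled);

echo $simple_result, "\n"; // the image is gone
echo $styled_result, "\n"; // unchanged: the extra attribute breaks the exact match
```

If the markup is not under your control, a DOM-based or regex-based approach is more forgiving.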

Upvotes: 0

Matt.G

Reputation: 3609

Use preg_quote to escape the URL before embedding it in the pattern:

$article = preg_replace("/<img[^>]+src=\"" . preg_quote($img_src, '/') . "\"[^>]*\>/i", "", $article);
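
To see why the escaping matters: the URL contains dots and slashes, which are a regex metacharacter and the pattern delimiter respectively. The sketch below wraps the same idea in a function. It is a variant rather than the exact one-liner above: the name strip_img_by_src is hypothetical, it also accepts single-quoted attributes, and it requires whitespace before src so that an attribute such as data-src is not mistaken for it.

```php
<?php
// Hypothetical helper (name is an assumption): removes <img> tags whose
// src attribute equals $src, using a regex built with preg_quote().
function strip_img_by_src(string $html, string $src): string
{
    // preg_quote() escapes regex metacharacters in the URL (".", "?", "+", ...);
    // passing '/' as the second argument also escapes the pattern delimiter.
    $quoted = preg_quote($src, '/');

    // [^>]* keeps the match inside one tag; (["\']) ... \1 requires the same
    // quote character on both sides of the URL.
    $pattern = '/<img\b[^>]*\ssrc=(["\'])' . $quoted . '\1[^>]*>/i';

    return preg_replace($pattern, '', $html);
}

$html = '<p>A</p><img style="width:100px;" src="http://www.example.org/a.jpg" class="c">'
    . '<img src="http://wwwXexample.org/a.jpg"><p>B</p>';

// Only the first image is removed; without escaping, the "." in the URL
// would also match the "X" in the second URL.
echo strip_img_by_src($html, 'http://www.example.org/a.jpg');
```

Note that the i flag makes the whole pattern case-insensitive, including the URL itself.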

Upvotes: 3

Joseph_J

Reputation: 3669

You can try this. It seems to test ok. At any rate it should give you an idea as to how to proceed.

$img_src = 'http://www.example.org/image_to_be_removed.jpg';

$article = '<h1>Test article with HTML5 tags</h1>
<nav><a href="/link1/">Link 1</a></nav>
<p>This is an example article. The article may or may not include html5 tags, images and other things.</p>
<img style="width:100px;" src="http://www.example.org/image_to_be_removed.jpg" class="myClass">
<p>More example text.</p>';

$article = preg_replace('/\s+/', ' ', $article); // Collapse runs of whitespace (including newlines) into a single space.
$quoted_src = preg_quote($img_src, '/'); // Escape slashes, dots and other regex metacharacters in the URL.
$regex = '/<img[^>]*src="' . $quoted_src . '"[^>]*>\s?/'; // [^>]* keeps the match inside a single tag.
$article = preg_replace($regex, '', $article);
echo $article;
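
One thing to watch with this kind of pattern: a match-anything class such as [\W\D\w] is greedy and can cross tag boundaries, so it may remove more than the image once the article has content after it. A small sketch of the difference, using made-up sample markup and short file names for readability:

```php
<?php
$html = '<img src="a.jpg" class="x"> <p>Keep me</p> <img src="b.jpg"> <p>End</p>';

// Greedy "match anything": runs on to the last ">" that is followed by whitespace.
$greedy = preg_replace('/<img[\W\D\w]*src="a\.jpg"[\W\D\w]*>\s/', '', $html);

// Negated class: cannot cross the closing ">" of the tag it started in.
$bounded = preg_replace('/<img[^>]*src="a\.jpg"[^>]*>\s?/', '', $html);

echo $greedy, "\n";  // the paragraph and the second image are lost as well
echo $bounded, "\n"; // only the first image is removed
```

Using [^>]* instead keeps each match confined to one tag.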

Upvotes: 0
