Reputation: 3011
How can I write a regex that matches all img tags and captures their "src" value, while ignoring any img tag that has a given class? Let's say I would like to get the src of every img tag that doesn't have "dontGetMe" among its classes (it may still have other classes).
For example:
<img src="teste1.jpg" class="blueClass brightClass dontGetMe" />
<img src="teste2.jpg" class="blueClass" />
<img src="teste3.jpg" class="dontGetMe" />
<img src="teste4.jpg" />
In this example, my regex should get teste2.jpg and teste4.jpg.
The regex I have so far is the following (it matches every img tag's src, regardless of whether the "dontGetMe" class is present):
((?:\<img).*)(src)
Note: this regex will be used in a PHP script, so it has to run successfully on "http://www.phpliveregex.com".
EDIT: I totally agree that regex doesn't seem to be the clearest or most reliable way to do this, but my lack of PHP knowledge ties me to this technology. The regex would be used in the following PHP function:
function Advanced_lazyload($buffer)
{
(...)
$pattern = '(REGEX EXPRESSION GOES HERE)';
$buffer = preg_replace($pattern, "$1 src='temp.gif' ImageHolder", $buffer);
return $buffer;
}
Upvotes: 0
Views: 112
Reputation: 57690
Don't use regex to parse HTML. This is a task for an XML parser.
The recommended way is to use XPath for this.
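// $html is assumed to hold the HTML markup to search.
$doc = new DOMDocument();
$doc->loadHTML($html);
$dox = new DOMXPath($doc);
// Select the src attribute of every img whose class attribute does not contain "dontGetMe".
$elements = $dox->query('//img[not(contains(@class, "dontGetMe"))]/@src');
foreach ($elements as $el) {
    echo $el->nodeValue, "\n";
}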
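One caveat: contains(@class, "dontGetMe") is a substring test, so it would also exclude an img whose class is, say, "dontGetMeNot". A minimal sketch of a whole-token match follows, using the common XPath idiom of padding the class list with spaces. The function name srcsWithoutClass and the extra teste5.jpg line are hypothetical, added only for illustration, and the sketch assumes $excludedClass is a plain class name without quotes.

```php
<?php
// Hypothetical helper (not from the original posts): returns the src of every
// <img> whose class list does not contain the exact token $excludedClass.
// Assumes $excludedClass is a plain class name with no quote characters.
function srcsWithoutClass(string $html, string $excludedClass): array
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally instead of emitting them;
    // HTML fragments and HTML5 markup commonly trigger such warnings.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Pad the normalized class list with spaces so only whole class names match.
    $query = sprintf(
        '//img[not(contains(concat(" ", normalize-space(@class), " "), " %s "))]/@src',
        $excludedClass
    );

    $srcs = [];
    foreach ($xpath->query($query) as $attr) {
        $srcs[] = $attr->nodeValue;
    }
    return $srcs;
}

$html = <<<HTML
<img src="teste1.jpg" class="blueClass brightClass dontGetMe" />
<img src="teste2.jpg" class="blueClass" />
<img src="teste3.jpg" class="dontGetMe" />
<img src="teste4.jpg" />
<img src="teste5.jpg" class="dontGetMeNot" />
HTML;

print_r(srcsWithoutClass($html, 'dontGetMe'));
```

With the sample markup from the question this should list teste2.jpg and teste4.jpg, plus the illustrative teste5.jpg, which the plain contains() query would have skipped.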
Upvotes: 4