Reputation: 51
HTML code
<img src="http://website/image/ngshjk.jpeg" onload="img_onload(this);" onerror="img_onerror(this);" data-pid="dynamicvalue" data-imagesize="ppew" data-error-url="http://img.comb/6/z2default.jpg" class="small_image imageZoom " alt="image" title="" id="visible-image-small" rel="dynamicvalue" data-zoom-src="http://img.comb/6/z21347.jpeg" style="display: inline;">
PHP code
preg_match_all('/<img(.*) onload="(.*)" \/s',$con,$val);
This page already has many img tags, so I tried to get the src of one particular image by matching on some of the attributes inside its img tag. I can't get the preg_match_all pattern right. Please help me correct it so that it returns the src of the img tag above.
Upvotes: 1
Views: 785
Reputation: 15000
To get all the image tags on the page, it would probably be much easier to use an HTML parsing tool such as PHP's DOMDocument:
// load your html string
$dom = new DOMDocument();
$dom->loadHTML($your_html_here);
// find all the img tags
$imgs = $dom->getElementsByTagName('img');
// cycle through all image tags
foreach ($imgs as $img) {
    $src = $img->getAttribute("src");
    // do something
}
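Since the question asks for one particular image rather than all of them, the loop above could be narrowed with an XPath query on the attributes. This is only a sketch: the sample markup and the attribute values ("ppew", "ABCDEFGHIJ") are placeholders, and `$html` stands in for your own page source.

```php
<?php
// Hypothetical sample markup; replace with your real HTML string.
$html = '<div>'
      . '<img src="http://website/other.jpeg" data-pid="zzz" data-imagesize="ppew">'
      . '<img src="http://website/someurl.jpeg" data-pid="ABCDEFGHIJ" data-imagesize="ppew">'
      . '</div>';

// Real-world HTML is rarely valid, so collect parser warnings instead of printing them.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

// Select only the img elements that carry both attribute values.
$xpath = new DOMXPath($dom);
$nodes = $xpath->query('//img[@data-imagesize="ppew" and @data-pid="ABCDEFGHIJ"]');

$sources = [];
foreach ($nodes as $img) {
    $sources[] = $img->getAttribute('src');
}

print_r($sources);
```

Because the parser works on attributes rather than raw text, the order of the attributes inside the tag does not matter here.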
Upvotes: 0
Reputation: 15000
This expression will match an img tag only if it has the attribute data-imagesize="ppew" and the attribute data-pid="ABCDEFGHIJ", and it will capture the src attribute value.
<img\b(?=\s) # capture the open tag
(?=(?:[^>=]|='[^']*'|="[^"]*"|=[^'"][^\s>]*)*?\sdata-imagesize="ppew") # validate data-imagesize exists with a specific value
(?=(?:[^>=]|='[^']*'|="[^"]*"|=[^'"][^\s>]*)*?\sdata-pid="ABCDEFGHIJ") # validate data-pid exists with a specific value
(?=(?:[^>=]|='[^']*'|="[^"]*"|=[^'"][^\s>]*)*?\ssrc=['"]([^"]*)['"]?) # capture the src attribute value
(?:[^>=]|='[^']*'|="[^"]*"|=[^'"\s]*)*"\s?\/?> # get the entire tag
Live Example: http://www.rubular.com/r/PBJ50cax7L
Single Line Regex: <img\b(?=\s)(?=(?:[^>=]|='[^']*'|="[^"]*"|=[^'"][^\s>]*)*?\sdata-imagesize="ppew")(?=(?:[^>=]|='[^']*'|="[^"]*"|=[^'"][^\s>]*)*?\sdata-pid="ABCDEFGHIJ")(?=(?:[^>=]|='[^']*'|="[^"]*"|=[^'"][^\s>]*)*?\ssrc=['"]([^"]*)['"]?)(?:[^>=]|='[^']*'|="[^"]*"|=[^'"\s]*)*"\s?\/?>
Sample Text
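Note that the first tag has some potentially problematic content: its onmouseover value contains text that looks like the target attributes, while its real data-pid has a different value, so it should not match.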
<img onmouseover=' data-imagesize="ppew" ; data-pid="ABCDEFGHIJ" ; funSwap(data-imagesize, data-pid) ; ' src="http://website/NotTheDroidYourLookingFor.jpeg" onload="img_onload(this);" onerror="img_onerror(this);" data-pid="jihgfedcba" data-imagesize="ppew" />
<img src="http://website/someurl.jpeg" onload="img_onload(this);" onerror="img_onerror(this);" data-pid="ABCDEFGHIJ" data-imagesize="ppew" />
Capture Groups
[0] = <img src="http://website/someurl.jpeg" onload="img_onload(this);" onerror="img_onerror(this);" data-pid="ABCDEFGHIJ" data-imagesize="ppew" />
[1] = http://website/someurl.jpeg
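For completeness, here is one way the single-line expression might be called from PHP. It is a sketch, not a tested production snippet: the `~` delimiter, the `i` flag, and the variable names are my own choices, and nowdoc strings are used only to avoid escaping the quotes in the pattern and the sample text.

```php
<?php
// The single-line regex from above, wrapped in ~ delimiters (an assumption;
// any delimiter that does not occur in the pattern would do).
$pattern = <<<'REGEX'
~<img\b(?=\s)(?=(?:[^>=]|='[^']*'|="[^"]*"|=[^'"][^\s>]*)*?\sdata-imagesize="ppew")(?=(?:[^>=]|='[^']*'|="[^"]*"|=[^'"][^\s>]*)*?\sdata-pid="ABCDEFGHIJ")(?=(?:[^>=]|='[^']*'|="[^"]*"|=[^'"][^\s>]*)*?\ssrc=['"]([^"]*)['"]?)(?:[^>=]|='[^']*'|="[^"]*"|=[^'"\s]*)*"\s?\/?>~i
REGEX;

// The two sample tags from above.
$con = <<<'HTML'
<img onmouseover=' data-imagesize="ppew" ; data-pid="ABCDEFGHIJ" ; funSwap(data-imagesize, data-pid) ; ' src="http://website/NotTheDroidYourLookingFor.jpeg" onload="img_onload(this);" onerror="img_onerror(this);" data-pid="jihgfedcba" data-imagesize="ppew" />
<img src="http://website/someurl.jpeg" onload="img_onload(this);" onerror="img_onerror(this);" data-pid="ABCDEFGHIJ" data-imagesize="ppew" />
HTML;

// $val[0] holds the full tags, $val[1] the captured src values.
$count = preg_match_all($pattern, $con, $val);

print_r($val[1]);
```

With the sample text, only the second tag should be reported, in line with the capture groups listed above.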
Upvotes: 0
Reputation: 71538
You might be better off using the lazy .*? instead of the greedy .*:
preg_match_all('/<img(.*?)\sonload="([^"]*)"/s',$con,$val);
Also change the second .* to [^"]*. The lazy .*? matches as few characters as possible before the next part of the pattern (in this case onload...) can match, and [^"]* matches any run of non-quote characters between the quotes.
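To make the difference concrete, here is a small comparison of the greedy and lazy versions. The two-tag input string is made up for illustration, and the greedy pattern is a repaired form of the one in the question (with a closing delimiter added) rather than the original as posted.

```php
<?php
// Hypothetical input with two img tags on one line.
$con = '<img src="a.jpeg" onload="f(this);" alt="x">'
     . '<img src="b.jpeg" onload="g(this);" alt="y">';

// Greedy: .* runs to the end of the input and backtracks only as far as
// needed, so both tags are swallowed into a single match.
preg_match_all('/<img(.*)\sonload="(.*)"/s', $con, $greedy);

// Lazy .*? plus [^"]*: each match stops at the first onload and at the
// closing quote of its value, so each tag is matched separately.
preg_match_all('/<img(.*?)\sonload="([^"]*)"/s', $con, $lazy);

echo count($greedy[0]), " greedy match(es)\n";
echo count($lazy[0]), " lazy match(es)\n";
print_r($lazy[1]); // attribute text between <img and onload, including src
print_r($lazy[2]); // the onload values
```

Note that the first capture group still contains the whole src="..." attribute text, not just the URL, so a further step is needed to isolate the source.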
Upvotes: 3