Reputation: 24195
I have the following code, which correctly extracts the URLs of the img tags from an XML string:
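import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Capture group 1 holds the value of the src attribute of each img tag.
Pattern p = Pattern.compile("<img[^>]+src\\s*=\\s*['\"]([^'\"]+)['\"][^>]*>");
Matcher m = p.matcher(xmlString);
while (m.find()) {
    imagesURLs.add(m.group(1));
}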
My XML looks like the following:
<item>
<desc>
txt txt txt txt <img src="htttp://mysite.com/images/img.png"> txt txt
<img src="htttp://mysite.com/images/img.png"> ...
</desc>
</item>
<item>
<desc>
txt txt txt txt <img src="htttp://mysite.com/images/img.png"> txt txt
<img src="htttp://mysite.com/images/img.png"><img src="htttp://mysite.com/images/img.png">
</desc>
</item>
I want to modify the code so that it gets only the URL of the first img tag in each desc tag.
Upvotes: 0
Views: 259
Reputation: 13541
Instead of trying to use a regex to figure this out (which is a very poor way to do this), you should actually parse the XML with an XML parsing library, such as XmlPullParser (bundled with Android) or one of the parsers in the Java standard library.
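As a rough sketch of the parser-based approach, the following uses StAX (javax.xml.stream, part of the Java standard library) rather than XmlPullParser, since both are pull-style APIs that work similarly. It assumes the input is well-formed XML, which the sample in the question is not as posted: the img tags would need to be self-closed and the item elements wrapped in a single root. The class and method names here are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class; not part of the original question's code.
class FirstImagePerDesc {

    // Returns the src of the first <img> inside each <desc> element.
    // Assumes the input is well-formed XML (self-closed img tags, one root).
    static List<String> firstImageUrls(String xml) {
        List<String> urls = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            boolean inDesc = false;      // currently inside a <desc> element
            boolean foundInDesc = false; // already took an img from this <desc>
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = reader.getLocalName();
                    if ("desc".equals(name)) {
                        inDesc = true;
                        foundInDesc = false;
                    } else if ("img".equals(name) && inDesc && !foundInDesc) {
                        String src = reader.getAttributeValue(null, "src");
                        if (src != null) {
                            urls.add(src);
                            foundInDesc = true;
                        }
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && "desc".equals(reader.getLocalName())) {
                    inDesc = false;
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            // Input that is not well-formed ends up here.
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return urls;
    }
}
```

Calling FirstImagePerDesc.firstImageUrls(xmlString) would then replace the Pattern/Matcher loop. If the desc content is really escaped HTML or CDATA rather than nested elements, the parser will not see img as an element, and the text of each desc would have to be handled separately.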
Upvotes: 2