Reputation: 3075
I want to extract a string from some HTML content. Here is the relevant part of the HTML:
<div style="border: 1px solid #999999; margin: 0px 10px 5px 0px;">
<a href="http://www.youtube.com">
<img alt="" src="http://someImage.jpg">
</a>
</div>
I get this as a string from SAX parsing. I want only the image path, "http://someImage.jpg", as a string.
How can I extract it?
Upvotes: 0
Views: 2039
Reputation: 94643
Try the jsoup parser.
Document doc=Jsoup.connect("http://www.yahoo.com").get();
Elements elements=doc.select("img");
for(Element e:elements)
{
System.out.println(e.attr("src"));
}
Or, since you already have the HTML as a string, use Jsoup.parse(html_string) to get a Document instance and run the same select on it.
Document doc=Jsoup.parse(html_string);
...
Upvotes: 0
Reputation: 56925
I think you can use a regular expression for this. Here is some code; please check it.
String subjectString = "<a href=\"http://www.youtube.com\"><img alt=\"\" src=\"http://someImage.jpg\"></a>";
Code for getting the href link from the anchor tag
Pattern linkFinder = Pattern.compile("<a[^>]*?href\\s*=\\s*((\'|\")(.*?)(\'|\"))[^>]*?(?!/)>", Pattern.DOTALL | Pattern.CASE_INSENSITIVE);
Matcher linkMatcher = linkFinder.matcher(subjectString);
while (linkMatcher.find()) {
// group(1) is the URL including its quotes; group(3) is the URL alone
Log.i("==== Link0",linkMatcher.group(3));
}
Code for getting the image path from the image tag
Pattern imageFinder = Pattern.compile("<img[^>]+src\\s*=\\s*['\"]([^'\"]+)['\"][^>]*>" , Pattern.DOTALL | Pattern.CASE_INSENSITIVE);
Matcher imageMatcher = imageFinder.matcher(subjectString);
while (imageMatcher.find())
{
// group(1) is the src value without the surrounding quotes
Log.i("==== Image Src",imageMatcher.group(1));
}
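For reference, the image-path pattern above can be wrapped in a small standalone class that runs outside Android. This is only a sketch: the class name ImgSrcExtractor and the method extractImgSrc are made up for illustration, and System.out.println stands in for Log.i. It assumes the src value is quoted and the tag is reasonably well formed; for arbitrary real-world HTML a proper parser is usually the safer choice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name is an assumption, not part of any library.
final class ImgSrcExtractor {

    // Same pattern as in the answer: group 1 captures the src value without quotes.
    private static final Pattern IMG_SRC = Pattern.compile(
            "<img[^>]+src\\s*=\\s*['\"]([^'\"]+)['\"][^>]*>",
            Pattern.DOTALL | Pattern.CASE_INSENSITIVE);

    // Returns the src attribute of every <img> tag found in the given HTML string.
    static List<String> extractImgSrc(String html) {
        List<String> sources = new ArrayList<>();
        Matcher matcher = IMG_SRC.matcher(html);
        while (matcher.find()) {
            sources.add(matcher.group(1));
        }
        return sources;
    }

    public static void main(String[] args) {
        // The HTML fragment from the question, joined into one string.
        String html = "<div style=\"border: 1px solid #999999; margin: 0px 10px 5px 0px;\">"
                + "<a href=\"http://www.youtube.com\">"
                + "<img alt=\"\" src=\"http://someImage.jpg\">"
                + "</a></div>";
        for (String src : extractImgSrc(html)) {
            System.out.println(src);
        }
    }
}
```

Returning a list rather than logging inside the loop lets the caller decide what to do when the fragment contains zero or several images.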
Upvotes: 3