Reputation: 2725
I've been through countless examples and questions here, and I'm still unable to parse the following page using jsoup:
I am trying to get the following lines from the page (highlighted in blue):
and
Edit: I've tried a selector as shown:
Document doc = Jsoup.connect("http://solr.cbssports.com/solr/select/?q=fantasy%20Tom%20Brady")
        .timeout(30000)
        .get();
Element resultLinks = doc.select("#docs > div:nth-child(1) > img").first();
Log.i(TAG, "img: " + resultLinks);
However, the above returns null.
Any help that can be provided would be greatly appreciated.
Thank you, Josh
Upvotes: 0
Views: 108
Reputation: 25340
The first one is not possible, since there's no img tag in the linked document.
You can get the second element with the following code:
Document doc = Jsoup.connect("http://solr.cbssports.com/solr/select/?q=fantasy%20Tom%20Brady")
        .timeout(30000)
        .get();
Element tomBrady = doc.select("str[name=content]:matchesOwn(12 Tom Brady, QB Player Page)").first();
System.out.println(tomBrady);
Here's the only element in the linked content that contains a URL to an image:
<str name="img_url">http://sports.cbsimg.net/images/football/nfl/players/60x80/187741.jpg</str>
As stated in the comments, curiously, the elements I get from the link differ from yours.
Since this is the only str tag with name=img_url, you can simply take the first one you find (using first()):
Element imgUrl = doc.select("str[name=img_url]").first();
String url = imgUrl.text();
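Note that first() returns null when nothing matches, which is what happened with the img selector in the question. Here is a minimal offline sketch of a null-safe lookup. The SAMPLE string, the class name and the firstText helper are made up for illustration; the sample only imitates the shape of the response and is not the real one. I parse it with jsoup's XML parser on the assumption that the Solr output is XML.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.parser.Parser;

public class ImgUrlExample {
    // Hypothetical sample that only imitates the shape of the Solr response.
    static final String SAMPLE =
        "<response><result name=\"response\"><doc>"
        + "<str name=\"content\">12 Tom Brady, QB Player Page</str>"
        + "<str name=\"img_url\">http://sports.cbsimg.net/images/football/nfl/players/60x80/187741.jpg</str>"
        + "</doc></result></response>";

    // Parses the sample as XML so no network access is needed.
    static Document parseSample() {
        return Jsoup.parse(SAMPLE, "", Parser.xmlParser());
    }

    // Returns the text of the first match, or null if the selector matches nothing.
    static String firstText(Document doc, String cssQuery) {
        Element el = doc.select(cssQuery).first();
        return el == null ? null : el.text();
    }

    public static void main(String[] args) {
        Document doc = parseSample();
        // A str element with name=img_url exists in the sample, so this prints its URL.
        System.out.println(firstText(doc, "str[name=img_url]"));
        // There is no img element in the sample, so this prints null.
        System.out.println(firstText(doc, "img"));
    }
}
```

Checking for null before calling text() avoids a NullPointerException if the element is missing from a particular response.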
If there's a possibility of more than one img_url element, you'd better select the parent node you need first and then select the proper URL element within it.
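A rough sketch of that idea, assuming each result sits in its own doc element with a content and an img_url child. That structure, the SAMPLE data, the example.com URL and the imgUrlFor helper are all assumptions for illustration, so adjust the selectors to what your response actually contains.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.parser.Parser;

public class ScopedImgUrlExample {
    // Hypothetical sample with two results; only the second one is the wanted player page.
    static final String SAMPLE =
        "<response><result name=\"response\">"
        + "<doc><str name=\"content\">Some other page</str>"
        + "<str name=\"img_url\">http://example.com/other.jpg</str></doc>"
        + "<doc><str name=\"content\">12 Tom Brady, QB Player Page</str>"
        + "<str name=\"img_url\">http://sports.cbsimg.net/images/football/nfl/players/60x80/187741.jpg</str></doc>"
        + "</result></response>";

    // Parses the sample as XML so no network access is needed.
    static Document parseSample() {
        return Jsoup.parse(SAMPLE, "", Parser.xmlParser());
    }

    // Finds the first doc node whose content contains the given text,
    // then reads img_url from inside that node only.
    static String imgUrlFor(Document doc, String contentText) {
        for (Element result : doc.select("doc")) {
            Element content = result.select("str[name=content]").first();
            if (content != null && content.text().contains(contentText)) {
                Element img = result.select("str[name=img_url]").first();
                return img == null ? null : img.text();
            }
        }
        return null; // no matching result node
    }

    public static void main(String[] args) {
        // Prints the img_url of the result whose content mentions "Tom Brady".
        System.out.println(imgUrlFor(parseSample(), "Tom Brady"));
    }
}
```

Selecting inside the matched node keeps the img_url of an unrelated result from being picked up just because it appears earlier in the document.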
Upvotes: 1