greysqrl

Reputation: 977

JSOUP - Accessing specific elements within a div class

I am trying to access elements within an HTML file in Android. I have retrieved the document using Volley (StringRequest) and am now trying to parse it using Jsoup.

The HTML document has some code within it as follows:

<div class="theProducts"> 
    <h3>
        <a href="http://www.myproduct.com/myproduct.html" >
            This is the product information I want to access
            <img src="http://prettypictures.myproduct.com/myproduct.jpg" alt=""  />
        </a>
    </h3>
</div>

I am able to access the 'theProducts' elements contained within the document by doing the following:

    Document doc = Jsoup.parse(response);

    String title = doc.title();
    Elements productElements = doc.getElementsByClass("theProducts");

    for (Element productElement : productElements) {
        //String name = productElement.attr("name");
        //String content = productElement.attr("content");
    }

So I do receive a collection of productElements quite happily. However, I am not sure how to access the specific text I want (i.e. 'This is the product information I want to access'). I can see it within the collection, but it's deeply nested.

Could anyone please explain the correct syntax to use? I'm not all that familiar with the DOM, so I'm getting a little confused. I did try doc.getElementsByClass(theProducts.h3) and (theProducts#h3), but neither worked; both returned 0 results.

I also tried to access outerHtml; however, this returns the entire <h3> section.

Any help is greatly appreciated.

Upvotes: 0

Views: 1312

Answers (2)

ELITE

Reputation: 5940

The easy way to get the elements you want is:

Elements els = doc.select("div.theProducts>h3>a");
for(Element el : els) {
    System.out.println(el.text());
}

Here doc.select("div.theProducts>h3>a") returns every a (anchor) element that is a direct child of an h3 which is itself a direct child of a div with class theProducts. Calling el.text() on each result gives the link text.
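For anyone who wants to try the selector in isolation, here is a minimal sketch run against the markup from the question. The class name ProductLinkText, the SAMPLE_HTML constant and the linkTexts helper are made up for illustration; only Jsoup.parse, select and text come from jsoup. Note that getElementsByClass expects a plain class name, which is probably why passing selector-like strings to it matched nothing.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

public class ProductLinkText {
    // Sample markup copied from the question (hypothetical constant name).
    static final String SAMPLE_HTML =
        "<div class=\"theProducts\">"
        + "<h3><a href=\"http://www.myproduct.com/myproduct.html\">"
        + " This is the product information I want to access "
        + "<img src=\"http://prettypictures.myproduct.com/myproduct.jpg\" alt=\"\" />"
        + "</a></h3></div>";

    // Collects the text of every <a> that is a direct child of an <h3>
    // that is itself a direct child of <div class="theProducts">.
    static List<String> linkTexts(String html) {
        Document doc = Jsoup.parse(html);
        List<String> texts = new ArrayList<>();
        for (Element link : doc.select("div.theProducts > h3 > a")) {
            // text() returns the element's text with whitespace normalized and trimmed.
            texts.add(link.text());
        }
        return texts;
    }

    public static void main(String[] args) {
        for (String text : linkTexts(SAMPLE_HTML)) {
            System.out.println(text);
        }
    }
}
```

In the Android case, the response string from the request would be passed to linkTexts in place of SAMPLE_HTML.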

EDIT: For more details about selector syntax, read this link

Upvotes: 1

greysqrl

Reputation: 977

A bit more searching and I found the answer here:

Parse the inner html tags using jSoup

I'll go and upvote it now!

Posting the answer here in the context of my question (as found on that page)...

// Find every <h3> in the document, then every <a> nested inside each one.
Elements headlinesCat1 = doc.getElementsByTag("h3");
for (Element headline : headlinesCat1) {
    Elements importantLinks = headline.getElementsByTag("a");
    for (Element link : importantLinks) {
        String linkHref = link.attr("href");
        String linkText = link.text(); //THIS IS THE TEXT I WANTED...
        System.out.println(linkHref);
        System.out.println(linkText);
    }
}
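One thing to be aware of: the loop above visits every h3 in the document, not just those inside theProducts. If that matters, the same traversal can start from the theProducts elements instead. A rough sketch, assuming the markup from the question; the class name ScopedProductLinks and the linksByHref helper are hypothetical names used only for this example.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.LinkedHashMap;
import java.util.Map;

public class ScopedProductLinks {
    // Maps each link's href to its text, looking only inside
    // elements that carry the class "theProducts".
    static Map<String, String> linksByHref(String html) {
        Document doc = Jsoup.parse(html);
        Map<String, String> result = new LinkedHashMap<>();
        for (Element product : doc.getElementsByClass("theProducts")) {
            for (Element headline : product.getElementsByTag("h3")) {
                for (Element link : headline.getElementsByTag("a")) {
                    result.put(link.attr("href"), link.text());
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // An <h3> outside theProducts is ignored; the one inside is kept.
        String html =
            "<h3><a href=\"/other\">Other</a></h3>"
            + "<div class=\"theProducts\"><h3><a href=\"/p\">Product</a></h3></div>";
        for (Map.Entry<String, String> entry : linksByHref(html).entrySet()) {
            System.out.println(entry.getKey() + " -> " + entry.getValue());
        }
    }
}
```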

Upvotes: 0
