Two13

Reputation: 307

Jsoup get comment before element

Say I have this html:

<!-- some comment -->
<div class="someDiv">
... other html
</div>
<!-- some comment 2 -->
<div class="someDiv">
... other html
</div>

I'm currently getting all divs where class == someDiv and scraping them for information. To do that I'm simply doing this:

Document doc = Jsoup.connect(url).get();
Elements elements = doc.select(".someDiv");
for (Element element : elements) {
    //scrape stuff
}

Within the for loop, is there any way to get the comment tag found before the particular div.someDiv element I'm on?

If this isn't possible, should I parse this HTML structure differently to meet this requirement?

Thanks for any advice.

Upvotes: 2

Views: 4038

Answers (3)

Daniel

Reputation: 37061

Try something like this: iterate over the child nodes, pick out the comments, and check whether each comment's next sibling is the div you are after.

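// The comments in the question are children of <body>, not of the Document itself,
// so walk the body's child nodes.
for (Node child : doc.body().childNodes()) {
    if (child.nodeName().equals("#comment")) {
        // Inspect child.nextSibling() (e.g. with hasAttr or attr) to figure out whether it is the div you expect.
        // Note that the next sibling may be a whitespace text node rather than the div.
    }
}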

Take a look at the jsoup Node docs.
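
If the divs are nested deeper than the top level of the body, the same idea can be applied to the whole tree. The following is only a minimal sketch under some assumptions: the class name `CommentsBeforeDivs` and the helper `commentsBefore` are made up for illustration, and blank text nodes between a comment and the following element are skipped. It relies on `Node.traverse`, `NodeVisitor`, `TextNode.isBlank` and `Element.hasClass` from jsoup.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Comment;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.nodes.TextNode;
import org.jsoup.select.NodeVisitor;

import java.util.ArrayList;
import java.util.List;

public class CommentsBeforeDivs {

    // Hypothetical helper: collects the text of every comment whose next
    // non-blank sibling is an element carrying the given class.
    static List<String> commentsBefore(Document doc, final String className) {
        final List<String> found = new ArrayList<String>();
        doc.traverse(new NodeVisitor() {
            @Override
            public void head(Node node, int depth) {
                if (!(node instanceof Comment)) {
                    return;
                }
                // Skip whitespace-only text nodes between the comment and the element.
                Node next = node.nextSibling();
                while (next instanceof TextNode && ((TextNode) next).isBlank()) {
                    next = next.nextSibling();
                }
                if (next instanceof Element && ((Element) next).hasClass(className)) {
                    found.add(((Comment) node).getData().trim());
                }
            }

            @Override
            public void tail(Node node, int depth) {
                // Nothing to do when leaving a node.
            }
        });
        return found;
    }

    public static void main(String[] args) {
        String html = "<div id=\"wrap\">\n<!-- some comment -->\n"
                + "<div class=\"someDiv\">... other html</div>\n</div>";
        Document doc = Jsoup.parse(html);
        for (String comment : commentsBefore(doc, "someDiv")) {
            System.out.println(comment);
        }
    }
}
```

The traversal visits nodes in document order, so the collected comments come back in the same order as the divs they precede.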

Upvotes: 2

korpe

Reputation: 353

Though this question is a few months old, here is my answer for completeness. How about using previousSibling to get the preceding Node? Of course, in real code you probably want to check whether you really got a Comment there.

String html = "<!-- some comment --><div class=\"someDiv\">... other html</div><!-- some comment 2 --><div class=\"someDiv\">... other html</div>";
Document doc = Jsoup.parseBodyFragment(html);
Elements elements = doc.select(".someDiv");
for (Element element : elements) {
    System.out.println(((Comment) element.previousSibling()).getData());
}

This produces:

some comment 
some comment 2 

(tested with jsoup 1.6.1 and 1.6.3)
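
One case where the direct cast can fail is the HTML from the question, where a line break sits between each comment and its div: jsoup keeps that whitespace as a text node, so previousSibling() may return a TextNode instead of the Comment. A hedged sketch of the check mentioned above follows; the class `PrecedingComment` and the helper `precedingComment` are hypothetical names, and it assumes that only blank text may separate the comment from the div.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Comment;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.nodes.TextNode;

public class PrecedingComment {

    // Hypothetical helper: returns the comment directly before the element,
    // ignoring whitespace-only text nodes, or null if there is none.
    static Comment precedingComment(Element element) {
        Node node = element.previousSibling();
        while (node instanceof TextNode && ((TextNode) node).isBlank()) {
            node = node.previousSibling();
        }
        return (node instanceof Comment) ? (Comment) node : null;
    }

    public static void main(String[] args) {
        String html = "<!-- some comment -->\n"
                + "<div class=\"someDiv\">... other html</div>\n"
                + "<!-- some comment 2 -->\n"
                + "<div class=\"someDiv\">... other html</div>\n"
                + "<div class=\"someDiv\">no comment here</div>";
        Document doc = Jsoup.parseBodyFragment(html);
        for (Element element : doc.select(".someDiv")) {
            Comment comment = precedingComment(element);
            System.out.println(comment == null ? "(no comment)" : comment.getData().trim());
        }
    }
}
```

Returning null instead of casting blindly lets the loop decide what to do with a div that has no comment in front of it.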

Upvotes: 4

Neil-MS

Reputation: 1

Elements elements = doc.select("div.someDiv");

http://jsoup.org/cookbook/

Upvotes: 0
