Aman

Reputation: 838

Extracting data with jsoup from a div whose class or id ends with "content"

Suppose there are two HTML pages, page1 and page2.

page1 contains HTML like:

<div class="content">
<p></p>
<p></p>
</div>

and page2 contains HTML like:

<div id="main-content">
<p></p>
<p></p>
<p></p>
</div>

I wrote a jsoup parser like this:

Document document = Jsoup.connect(url).get();
Elements links = document.select("div[class~=content$]");

for (Element heading2 : links) {
    Elements p = heading2.select("p");
    for (Element ptext : p) {
        System.out.println(ptext.text());
    }
}

This code parses the data when the div's class ends with "content". When the page instead marks the div with an id (as in page2), nothing is parsed, which is expected. My question: is there a way to check whether either the id or the class of a <div> ends with "content", and parse it in both cases?

Upvotes: 0

Views: 633

Answers (1)

Pshemo

Reputation: 124285

You can use a comma to combine several independent selectors; an element is selected if it matches any of them. So you can write one selector that matches <div id="main-content"> and another that matches <div class="content">:

.select("div[id~=content$], div[class~=content$]");
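For completeness, here is a minimal sketch of how that combined selector could be dropped into the loop from the question. It parses inline HTML strings with Jsoup.parse instead of fetching a URL, and the class name ContentSelectorSketch, the helper paragraphTexts, and the sample paragraph texts are made up for illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

public class ContentSelectorSketch {

    // Hypothetical helper: collects the text of every <p> inside a div
    // whose id or class attribute ends with "content".
    static List<String> paragraphTexts(String html) {
        Document document = Jsoup.parse(html);
        List<String> texts = new ArrayList<>();
        // The comma acts as "or": a div is selected if either selector matches.
        for (Element div : document.select("div[id~=content$], div[class~=content$]")) {
            for (Element p : div.select("p")) {
                texts.add(p.text());
            }
        }
        return texts;
    }

    public static void main(String[] args) {
        // Sample markup modelled on page1 and page2; the paragraph texts are invented.
        String page1 = "<div class=\"content\"><p>one</p><p>two</p></div>";
        String page2 = "<div id=\"main-content\"><p>three</p></div>";
        System.out.println(paragraphTexts(page1)); // prints [one, two]
        System.out.println(paragraphTexts(page2)); // prints [three]
    }
}
```

Note that the regex in [class~=content$] is applied to the whole class attribute value, so a div with class="content wide" would not match. If you do not need a regex, jsoup also has a suffix form, [attr$=value], so "div[id$=content], div[class$=content]" should behave the same way for these two pages.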

Upvotes: 1
