The Learner

Reputation: 3927

Using jsoup to get an element with a particular id from an HTML file

I have an HTML file like this:

<div class="student">
<h4 id="Classnumber100" class="studentheading">
   <a id="studentlink22" href="/grade8/greg">22. Greg</a>
</h4>
<div class="studentcategories">
<div class="studentneighborhoods">
</div>
</div>
</div>

I want to use jsoup to get the URL /grade8/greg and the text "22. Greg".

I tried this selector:

    Elements listo = doc.select("h4 #studentlink22");

I am not able to get the values.

Actually, I want to select based on Classnumber100. There are 300 records in the HTML page, and the only thing consistent across them is Classnumber100.

So I want my selector to select all the hrefs and text that come after Classnumber100. How can I do that?

I tried doc.select("class#studentheading"); and many other possibilities, but none of them work.

Upvotes: 0

Views: 2620

Answers (2)

Hovercraft Full Of Eels

Reputation: 285405

The select method matches on the HTML tag, here h4 and a, and then on the attributes if you include them in the selector. Have you gone to the jsoup site? The use of select is well described there for this situation.

For example:

// code not tested
Elements listo = doc.select("h4[id=Classnumber100]").select("a");

String text = listo.text(); // for  "22. Greg"
String path = listo.attr("href"); // for  "/grade8/greg"
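
Since the page reportedly holds around 300 records, it may help to walk the matches one at a time: on an Elements collection, text() returns the combined text of every match and attr() returns the value from the first match that has the attribute. The following is a minimal sketch rather than a tested solution for the real page. The class name StudentLinks, the helper extractLinks, and the second sample record are made up for illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

public class StudentLinks {

    // Hypothetical helper: returns one {href, text} pair per matching link.
    static List<String[]> extractLinks(String html) {
        Document doc = Jsoup.parse(html);
        List<String[]> links = new ArrayList<>();
        // Attribute selector, so headings that repeat the same id still match.
        for (Element a : doc.select("h4[id=Classnumber100] a")) {
            links.add(new String[] { a.attr("href"), a.text() });
        }
        return links;
    }

    public static void main(String[] args) {
        // Assumed sample input: the second record is invented for illustration.
        String html =
              "<h4 id=\"Classnumber100\" class=\"studentheading\">"
            + "<a id=\"studentlink22\" href=\"/grade8/greg\">22. Greg</a></h4>"
            + "<h4 id=\"Classnumber100\" class=\"studentheading\">"
            + "<a id=\"studentlink23\" href=\"/grade8/anna\">23. Anna</a></h4>";

        for (String[] link : extractLinks(html)) {
            System.out.println(link[0] + " -> " + link[1]);
        }
    }
}
```

Each array holds the href at index 0 and the link text at index 1, so the values stay paired per record instead of being merged into one string.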


Upvotes: 1

user684934

Reputation:

First of all, multiple elements should not share the same id, so each of these elements should not have the id Classnumber100. However, if this is the case, then you can still select them using the selector [id=Classnumber100].

If you're only interested in the a tags inside, then you can use [id=Classnumber100] > a.

Upon re-reading the question, it appears that the h4 tags you're interested in share the class attribute studentheading, in which case you can use the class selector, i.e.

doc.select(".studentheading > a")
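
To compare the selectors mentioned here against the HTML from the question, a small check like the one below may be useful. It is only a sketch: the class name SelectorCheck and the helper countMatches are hypothetical, and the HTML is the single record quoted in the question, not the full 300-record page. It also includes the selector class#studentheading from the question, which asks for an element whose tag name is class and whose id is studentheading, not for an element with that class.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class SelectorCheck {

    // Hypothetical helper: number of elements a CSS query matches in the given HTML.
    static int countMatches(String html, String cssQuery) {
        Document doc = Jsoup.parse(html);
        return doc.select(cssQuery).size();
    }

    public static void main(String[] args) {
        // The single record quoted in the question.
        String html =
              "<div class=\"student\">"
            + "<h4 id=\"Classnumber100\" class=\"studentheading\">"
            + "<a id=\"studentlink22\" href=\"/grade8/greg\">22. Greg</a>"
            + "</h4>"
            + "<div class=\"studentcategories\">"
            + "<div class=\"studentneighborhoods\"></div>"
            + "</div>"
            + "</div>";

        String[] queries = {
            "[id=Classnumber100] > a",  // attribute selector on the id
            ".studentheading > a",      // class selector
            "class#studentheading"      // tag "class" with id "studentheading"
        };
        for (String query : queries) {
            System.out.println(query + " matches " + countMatches(html, query));
        }
    }
}
```

The > combinator restricts the match to a tags that are direct children of the heading, which fits the markup shown in the question.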

Upvotes: 1
