Ascalonian

Reputation: 15174

Is there a way to get multiple elements deep in one Jsoup select?

I am looking at a generic-looking HTML table that contains some values I need to extract.

An example of the table looks like:

<table width="100%" class="list"> 
 <tbody>
  <tr> 
   <td><font style="font-family:Verdana; color:black; font-size:8pt; "><label>Project Number</label></font></td> 
   <td><font style="font-family:Verdana; color:black; font-size:8pt; ">123456</font> </td> 
  </tr> 
  <tr height="22"> 
   <td><font style="font-family:Verdana; color:black; font-size:8pt; "><label>Report Number</label></font></td> 
   <td><font style="font-family:Verdana; color:black; font-size:8pt; ">REP445566</font></td> 
  </tr> 
 </tbody>
</table>

I want to extract the values from the second <td> of each row. I would rather not create one Element for the table, another for the <tr>, another for the <td>, and yet another for the <font>. Is there a way to select something like "tr > td > font" so that I can avoid creating multiple Elements just to drill down to the text inside the <font>?

What I have so far:

Elements listTables = doc.getElementsByClass("list");

// There is a table above the one I want to use
Element mainTable = listTables.get(1);

Elements trs = mainTable.select("tr");

for (Element tr : trs) {
    Elements tds = tr.select("td");

    Element label = tds.get(0);

    if (tds.size() > 1) {
        Element value = tds.get(1);
        // This gets me the td, now I need the value of the font
    }


}

Upvotes: 1

Views: 569

Answers (2)

Mark Lee

Reputation: 301

Yes, Jsoup's select accepts CSS selectors that span several levels, so one query can reach the second cell of every row. Try this:

String tdPath = "table > tbody > tr > td:nth-child(2)";
Elements secondTd = doc.select(tdPath);
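To show the same idea going all the way down to the <font> element the question asks about, here is a minimal runnable sketch. The class name SecondCellValues, the HTML constant and the values method are made up for illustration, and the markup is copied from the question rather than from the real page. It assumes the second cell always holds a <font> as a direct child.

```java
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

// Hypothetical demo class; the names here are illustrative and not part of Jsoup.
class SecondCellValues {
    // Sample markup taken from the question, standing in for the real page.
    static final String HTML =
        "<table width=\"100%\" class=\"list\"><tbody>"
        + "<tr><td><font><label>Project Number</label></font></td>"
        + "<td><font>123456</font> </td></tr>"
        + "<tr height=\"22\"><td><font><label>Report Number</label></font></td>"
        + "<td><font>REP445566</font></td></tr>"
        + "</tbody></table>";

    // One select walks table -> row -> second cell -> font.
    static List<String> values(String html) {
        Document doc = Jsoup.parse(html);
        // eachText() returns the text of every matched element that has text.
        return doc.select("table.list tr > td:nth-child(2) > font").eachText();
    }

    public static void main(String[] args) {
        System.out.println(values(HTML)); // prints [123456, REP445566]
    }
}
```

If the <font> wrapper is not always present, dropping the trailing "> font" and calling text() on the <td> itself should give the same strings, since text() includes the text of nested elements.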

Upvotes: 2

LittlePanda

Reputation: 2507

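You need to use a CSS selector query. Here "td + td" matches any <td> that immediately follows another <td>, and text() returns the text of the nested <font>: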

// doc is the parsed Document from the question
Elements e = doc.select("table.list > tbody > tr > td + td");
for (Element td : e) {
    System.out.println(td.text());
}

Output:

123456
REP445566
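The question mentions that another table with class "list" sits above the one of interest, so a document-wide query would also match cells from that first table. A rough sketch of one way to handle this is below: pick the table by index first, then run a single select inside it and pair each value with its label. The class LabelValueTable, the read method and the tableIndex parameter are hypothetical names, and the sample markup is a simplified stand-in that assumes two-cell rows.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

// Hypothetical helper; the names here are illustrative and not part of Jsoup.
class LabelValueTable {
    // Reads label/value pairs from the table.list at the given zero-based index.
    // Throws IndexOutOfBoundsException if the page has fewer matching tables.
    static Map<String, String> read(String html, int tableIndex) {
        Document doc = Jsoup.parse(html);
        Element table = doc.select("table.list").get(tableIndex);
        Map<String, String> result = new LinkedHashMap<>();
        // "td + td" only matches a cell that has a preceding <td> sibling,
        // so previousElementSibling() is non-null for every match.
        for (Element valueCell : table.select("tr > td + td")) {
            Element labelCell = valueCell.previousElementSibling();
            result.put(labelCell.text(), valueCell.text());
        }
        return result;
    }

    public static void main(String[] args) {
        String html =
            "<table class=\"list\"><tr><td>Other</td><td>ignored</td></tr></table>"
            + "<table class=\"list\">"
            + "<tr><td><font><label>Project Number</label></font></td>"
            + "<td><font>123456</font></td></tr>"
            + "<tr><td><font><label>Report Number</label></font></td>"
            + "<td><font>REP445566</font></td></tr>"
            + "</table>";
        // Index 1 selects the second table, as in the question's listTables.get(1).
        System.out.println(read(html, 1));
        // prints {Project Number=123456, Report Number=REP445566}
    }
}
```

A LinkedHashMap is used so that the entries keep the row order of the table.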

Upvotes: 2
