Reputation: 15174
I am looking at a generic-looking HTML table containing some values I need to extract.
An example of the table looks like:
<table width="100%" class="list">
<tbody>
<tr>
<td><font style="font-family:Verdana; color:black; font-size:8pt; "><label>Project Number</label></font></td>
<td><font style="font-family:Verdana; color:black; font-size:8pt; ">123456</font> </td>
</tr>
<tr height="22">
<td><font style="font-family:Verdana; color:black; font-size:8pt; "><label>Report Number</label></font></td>
<td><font style="font-family:Verdana; color:black; font-size:8pt; ">REP445566</font></td>
</tr>
</tbody>
</table>
What I want to do is pull the values out of the second <td> tag in each row. I don't want to create one Element for the table, then another for the <tr> tag, another for the <td>, and then another for the <font>. Is there a way to select something like "tr > td > font" so I can avoid creating multiple Elements just to drill down to the value of the font?
What I have so far is:
Elements listTables = doc.getElementsByClass("list");
// There is a table above the one I want to use
Element mainTable = listTables.get(1);
Elements trs = mainTable.select("tr");
for (Element tr : trs) {
    Elements tds = tr.select("td");
    Element label = tds.get(0);
    if (tds.size() > 1) {
        Element value = tds.get(1);
        // This gets me the td, now I need the value of the font
    }
}
Upvotes: 1
Views: 569
Reputation: 301
Sure, Jsoup is very powerful! Try this:
String tdPath = "table > tbody > tr > td:nth-child(2)";
Elements secondTd = doc.select(tdPath);
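To drill all the way down to the <font> in one query, the selector can simply be extended by one more step. The following is a minimal sketch, assuming Jsoup is on the classpath; the class name SecondCellDemo, the method secondCellValues, and the HTML constant (the question's table with the style attributes trimmed) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class SecondCellDemo {
    // Sample markup based on the question (style attributes removed for brevity).
    static final String HTML =
        "<table width=\"100%\" class=\"list\"><tbody>"
        + "<tr><td><font><label>Project Number</label></font></td>"
        + "<td><font>123456</font> </td></tr>"
        + "<tr height=\"22\"><td><font><label>Report Number</label></font></td>"
        + "<td><font>REP445566</font></td></tr>"
        + "</tbody></table>";

    // Returns the text of the <font> inside the second <td> of every row.
    static List<String> secondCellValues(String html) {
        Document doc = Jsoup.parse(html);
        List<String> values = new ArrayList<>();
        // One selector replaces the table -> tr -> td -> font drill-down.
        for (Element font : doc.select("table.list tr > td:nth-child(2) > font")) {
            values.add(font.text());
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(secondCellValues(HTML)); // prints [123456, REP445566]
    }
}
```

Note that a selector starting at "table" matches every table in the document, so on a page with several tables it may be safer to call select on the specific table Element instead of on doc.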
Upvotes: 2
Reputation: 2507
You need to use a CSS selector query:
Elements e = doc.select("table.list > tbody > tr > td + td");
for (int i = 0; i < e.size(); i++) {
    System.out.println(e.get(i).text());
}
Output:
123456
REP445566
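If the labels are needed alongside the values, and the wanted table is the second one with class "list" as in the question, the same idea can be combined with a single loop over the rows. This is only a sketch under those assumptions: the class LabelValueDemo, the method labelsToValues, and the sample page with an extra table in front are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

class LabelValueDemo {
    // Hypothetical page: an unrelated "list" table followed by the one from the question.
    static final String HTML =
        "<table class=\"list\"><tr><td>Ignored</td><td>Ignored too</td></tr></table>"
        + "<table width=\"100%\" class=\"list\"><tbody>"
        + "<tr><td><font><label>Project Number</label></font></td>"
        + "<td><font>123456</font> </td></tr>"
        + "<tr height=\"22\"><td><font><label>Report Number</label></font></td>"
        + "<td><font>REP445566</font></td></tr>"
        + "</tbody></table>";

    // Maps the text of the first cell of each row to the text of the second cell.
    static Map<String, String> labelsToValues(String html, int tableIndex) {
        Document doc = Jsoup.parse(html);
        Elements tables = doc.select("table.list");
        Map<String, String> result = new LinkedHashMap<>();
        if (tableIndex >= tables.size()) {
            return result; // no such table, nothing to extract
        }
        for (Element row : tables.get(tableIndex).select("tr")) {
            Elements cells = row.children(); // the <td> elements of this row
            if (cells.size() >= 2) {
                // text() collects the text of nested <font>/<label> elements too
                result.put(cells.get(0).text(), cells.get(1).text());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // prints {Project Number=123456, Report Number=REP445566}
        System.out.println(labelsToValues(HTML, 1));
    }
}
```

Because text() on a <td> already includes the text of its descendants, there is no need to select the <font> separately when only the value is wanted.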
Upvotes: 2