Reputation: 31
I am trying to parse a table on a website to find the table row whose first column matches a certain string of characters. Below is the HTML for part of the table (the full table is much larger):
<table class="table display datatable" id="datatable1">
<thead>
<tr>
<th class="va-m">Miner</th>
<th class="va-m">Shares</th>
<th class="va-m">%</th>
<th class="va-m">Best DL</th>
</tr>
</thead>
<tfoot>
<tr>
<th class="va-m">Miner</th>
<th class="va-m">Shares</th>
<th class="va-m">%</th>
<th class="va-m">Best DL</th>
</tr>
</tfoot>
<tbody>
<tr>
<td>3R8RDBxiux3g1pFCCsQnm2vwD34axsVRTrEWzyX8tngJaRnNWkbnuFEewzuBAKhQrb3LxEQHtuBg1zW4tybt83SS</td>
<td>44279</td>
<td>27.37 %</td>
<td>1154</td>
</tr>
<tr>
<td>5gwVxC9cXguHHjD9wtTpHfsJPaZx4fPcvWD5jGWF1dcuHnAMyXxteaHrEtXviZkvWN3FAnevbVLErABSsP6mS7PR</td>
<td>36369</td>
<td>22.48 %</td>
<td>2725</td>
</tr>
<tr>
<td>2qZXPmop82UiA7LQEQqdoUzjFbcwCSpqf8U1f3656XXSsHnGvGXYTNoP11s2asiVSyVS8LPFqxmpdCeSNxcpFMnF</td>
<td>28596</td>
<td>17.68 %</td>
<td>967</td>
</tr>
<tr>
<td>21mbNSDo7g9BAyjsZGxnNfJUrEtBUVVNQZhR4tkVwdEHPaMNsa2u2JHQPAAe5riGfPA9Khb1Pq3bQGhqmrLEGNqN</td>
<td>6104</td>
<td>3.77 %</td>
<td>4787</td>
</tr>
<tr>
<td>4HAakKK7dSq18Djg7m6cLSyHb5aUU6ngvBQimo8pYyF5F64qX3gE4T8q8kfWHTZ79FvXybSG3JhUfSZDDv2sRwqY</td>
<td>5895</td>
<td>3.64 %</td>
<td>6020</td>
</tr>
<tr>
<td>2r2izPEC5o7ZDnUsdDA97q8wKCeZRRg9n243Rd9vkMQqRCtc6ZRUTruQUyZGduoHy8pTYPuEq9ACXPKfXt8fqKxS</td>
<td>5605</td>
<td>3.46 %</td>
<td>10958</td>
</tr>
</tbody>
</table>
I am trying to step through the table and search for a specific row, but I am receiving an IndexOutOfBoundsException.
Would there be a better way to code the statement below?
for (Element table : doc.select("table")) {
    for (Element row : table.select("tr")) {
        Elements tds = row.select("td");
        if (tds.get(0).text().equals("4HjSN79KUMz7AQC3GBvGkgPa5Qrio9HWTh7hg9JY48fkrYeVZJVmzB9YCB6GZSpuXB7V7DjJVuke3ZaCm5k7sRLE")) {
            myHistoricShares = tds.get(0).text();
        }
    }
}
Upvotes: 3
Views: 453
Reputation: 8481
As I said in the comments, your table.select("tr") selects rows not only inside <tbody>, but also inside the header and the footer. Those rows contain only <th> cells, so row.select("td") returns an empty list for them, and tds.get(0) throws the IndexOutOfBoundsException.
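To make the failure mode concrete, here is a minimal plain-Java sketch that models each table row as a list of its <td> texts, with the header and footer rows modelled as empty lists. The class name EmptyRowGuard, the method findFirstCell, and the sample values are hypothetical and only illustrate the emptiness check; this is not jsoup code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class EmptyRowGuard {

    // Returns the first cell of the first row whose first cell equals target,
    // or null if no row matches. Rows without any cells are skipped.
    static String findFirstCell(List<List<String>> rows, String target) {
        for (List<String> cells : rows) {
            if (cells.isEmpty()) {
                continue; // models a header/footer row: no <td>, so get(0) would throw
            }
            if (target.equals(cells.get(0))) {
                return cells.get(0);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Hypothetical sample data standing in for the parsed table.
        List<List<String>> rows = new ArrayList<>();
        rows.add(Collections.emptyList());           // <thead> row: only <th> cells
        rows.add(Arrays.asList("minerA", "44279"));  // <tbody> rows
        rows.add(Arrays.asList("minerB", "36369"));
        rows.add(Collections.emptyList());           // <tfoot> row: only <th> cells

        try {
            rows.get(0).get(0); // unguarded access on the header row
        } catch (IndexOutOfBoundsException e) {
            System.out.println("Unguarded get(0) failed on an empty row");
        }

        System.out.println(findFirstCell(rows, "minerB"));
    }
}
```

The same guard applies to the original loop: checking that tds is not empty before calling tds.get(0) avoids the exception even without changing the selector.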
You could simplify your loop by selecting only the rows in <tbody>:
for (Element row : doc.select("table#datatable1>tbody>tr")) {
    if (row.children().size() > 0 && "some_long_string".equals(row.child(0).text())) {
        doSomething();
    }
}
The selector "table#datatable1>tbody>tr" selects the table with id="datatable1", then its direct tbody child, and then all of that tbody's direct tr children. A single loop over the result is therefore enough; you no longer need the nested loop over tables and rows.
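For reference, here is a self-contained sketch of that approach which can be run against a small inline document. It assumes jsoup is on the classpath; the class name TbodyRowLookup, the method findShares, the shortened miner ids, and the choice to return the second column ("Shares") are assumptions made for illustration, not part of the original code.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

class TbodyRowLookup {

    // Returns the text of the second cell ("Shares") of the body row whose
    // first cell equals minerId, or null if no such row exists.
    static String findShares(Document doc, String minerId) {
        // Only rows that are direct children of <tbody> are visited,
        // so header and footer rows never reach the loop body.
        for (Element row : doc.select("table#datatable1 > tbody > tr")) {
            Elements cells = row.select("td");
            if (cells.size() > 1 && minerId.equals(cells.get(0).text())) {
                return cells.get(1).text();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Cut-down stand-in for the table in the question (hypothetical ids).
        String html =
              "<table id=\"datatable1\">"
            + "<thead><tr><th>Miner</th><th>Shares</th></tr></thead>"
            + "<tfoot><tr><th>Miner</th><th>Shares</th></tr></tfoot>"
            + "<tbody>"
            + "<tr><td>minerA</td><td>44279</td></tr>"
            + "<tr><td>minerB</td><td>36369</td></tr>"
            + "</tbody>"
            + "</table>";

        Document doc = Jsoup.parse(html);
        System.out.println(findShares(doc, "minerB"));
    }
}
```

Returning null for "not found" keeps the sketch short; in real code an Optional<String> may be a clearer way to signal a missing row.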
Upvotes: 2