Connor Shipp

Reputation: 31

Parsing Table with Jsoup for Android App

I am attempting to parse a table on a website to find a given table row in which the first column matches a certain string of characters. Below is the HTML for part of the table (it's very large):

<table class="table display datatable" id="datatable1">
    <thead>
        <tr>
            <th class="va-m">Miner</th>
            <th class="va-m">Shares</th>
            <th class="va-m">%</th>
            <th class="va-m">Best DL</th>
        </tr>
    </thead>
    <tfoot>
        <tr>
            <th class="va-m">Miner</th>
            <th class="va-m">Shares</th>
            <th class="va-m">%</th>
            <th class="va-m">Best DL</th>
        </tr>
    </tfoot>
    <tbody>
        <tr>
            <td>3R8RDBxiux3g1pFCCsQnm2vwD34axsVRTrEWzyX8tngJaRnNWkbnuFEewzuBAKhQrb3LxEQHtuBg1zW4tybt83SS</td>
            <td>44279</td>
            <td>27.37 %</td>
            <td>1154</td>
        </tr>
        <tr>
            <td>5gwVxC9cXguHHjD9wtTpHfsJPaZx4fPcvWD5jGWF1dcuHnAMyXxteaHrEtXviZkvWN3FAnevbVLErABSsP6mS7PR</td>
            <td>36369</td>
            <td>22.48 %</td>
            <td>2725</td>
        </tr>
        <tr>
            <td>2qZXPmop82UiA7LQEQqdoUzjFbcwCSpqf8U1f3656XXSsHnGvGXYTNoP11s2asiVSyVS8LPFqxmpdCeSNxcpFMnF</td>
            <td>28596</td>
            <td>17.68 %</td>
            <td>967</td>
        </tr>
        <tr>
            <td>21mbNSDo7g9BAyjsZGxnNfJUrEtBUVVNQZhR4tkVwdEHPaMNsa2u2JHQPAAe5riGfPA9Khb1Pq3bQGhqmrLEGNqN</td>
            <td>6104</td>
            <td>3.77 %</td>
            <td>4787</td>
        </tr>
        <tr>
            <td>4HAakKK7dSq18Djg7m6cLSyHb5aUU6ngvBQimo8pYyF5F64qX3gE4T8q8kfWHTZ79FvXybSG3JhUfSZDDv2sRwqY</td>
            <td>5895</td>
            <td>3.64 %</td>
            <td>6020</td>
        </tr>
        <tr>
            <td>2r2izPEC5o7ZDnUsdDA97q8wKCeZRRg9n243Rd9vkMQqRCtc6ZRUTruQUyZGduoHy8pTYPuEq9ACXPKfXt8fqKxS</td>
            <td>5605</td>
            <td>3.46 %</td>
            <td>10958</td>
        </tr>
    </tbody>
</table>

I am trying to step through the table and search for a specific row, but I am receiving an IndexOutOfBoundsException.

Would there be a better way to code the statement below?

for (Element table : doc.select("table")){
    for(Element row : table.select("tr")){
        Elements tds = row.select("td");
        if(tds.get(0).text().equals("4HjSN79KUMz7AQC3GBvGkgPa5Qrio9HWTh7hg9JY48fkrYeVZJVmzB9YCB6GZSpuXB7V7DjJVuke3ZaCm5k7sRLE")){
            myHistoricShares =tds.get(0).text();
        }
    }
}

Upvotes: 3

Views: 453

Answers (1)

Kirill Simonov

Reputation: 8481

As I said in the comments, your table.select("tr") selects rows not only inside <tbody>, but inside the header (<thead>) and the footer (<tfoot>) too. Those rows contain only <th> cells, so row.select("td") returns an empty list for them, and tds.get(0) throws the IndexOutOfBoundsException.
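To see the cause in isolation, here is a minimal sketch. It assumes jsoup is on the classpath and uses a trimmed-down stand-in for the table with made-up short values; the class name HeaderRowDemo and the method countRowsWithoutTd are hypothetical names chosen for illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class HeaderRowDemo {
    // Trimmed-down stand-in for the table in the question (hypothetical short values).
    static final String HTML =
        "<table id=\"datatable1\">"
      + "<thead><tr><th>Miner</th><th>Shares</th></tr></thead>"
      + "<tfoot><tr><th>Miner</th><th>Shares</th></tr></tfoot>"
      + "<tbody><tr><td>minerA</td><td>44279</td></tr></tbody>"
      + "</table>";

    // Counts the rows that contain no <td> cells at all.
    static int countRowsWithoutTd(String html) {
        Document doc = Jsoup.parse(html);
        int count = 0;
        for (Element row : doc.select("table tr")) {
            Elements tds = row.select("td");
            if (tds.isEmpty()) {
                count++; // calling tds.get(0) on this row would throw
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // The header row and the footer row have only <th> cells.
        System.out.println(countRowsWithoutTd(HTML)); // prints 2
    }
}
```

Guarding the original loop with a check such as tds.isEmpty() before calling tds.get(0) would therefore also avoid the exception.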

You could simplify your loop by selecting only the rows in <tbody>:

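// Iterate only over the data rows; header and footer rows are never visited
for (Element row : doc.select("table#datatable1>tbody>tr")) {
    // Guard against rows without cells before reading the first one
    if (row.children().size() > 0 && "some_long_string".equals(row.child(0).text())) {
        doSomething();
    }
}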

The selector "table#datatable1>tbody>tr" selects the table with id="datatable1", then its direct tbody child, and then all of that tbody's direct tr children (">" matches direct children only). So a single loop over the data rows is enough.
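Putting it together, a possible helper could look like the sketch below. It assumes jsoup is on the classpath and that the value wanted is the "Shares" column (the second cell), which is a guess based on the variable name myHistoricShares. The class name MinerLookup, the method findShares, and the short miner ids in SAMPLE are hypothetical.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class MinerLookup {
    // Small stand-in for the real page (hypothetical short miner ids).
    static final String SAMPLE =
        "<table class=\"table display datatable\" id=\"datatable1\">"
      + "<thead><tr><th>Miner</th><th>Shares</th></tr></thead>"
      + "<tfoot><tr><th>Miner</th><th>Shares</th></tr></tfoot>"
      + "<tbody>"
      + "<tr><td>minerA</td><td>44279</td></tr>"
      + "<tr><td>minerB</td><td>36369</td></tr>"
      + "</tbody></table>";

    // Returns the text of the second cell for the row whose first cell
    // equals minerId, or null if no such row exists.
    static String findShares(String html, String minerId) {
        Document doc = Jsoup.parse(html);
        for (Element row : doc.select("table#datatable1>tbody>tr")) {
            // Need at least two cells: the miner id and the shares.
            if (row.children().size() > 1 && minerId.equals(row.child(0).text())) {
                return row.child(1).text();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(findShares(SAMPLE, "minerB"));  // prints 36369
        System.out.println(findShares(SAMPLE, "unknown")); // prints null
    }
}
```

Returning null for a missing miner keeps the sketch short; in an app you might prefer an Optional or a default value.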

Upvotes: 2
