takotsubo

Reputation: 746

Extract data from HTML table with Jsoup

I want to extract information from this table:

HTML code for this table:

<tr>
 <th>Rank</th>
 <th>Level</th>
 <th>IVs (A/D/S)</th>
 <th>CP</th>
 <th class="hidden-sm">Att</th>
 <th class="hidden-sm">Def</th>
 <th class="hidden-sm">Sta</th>
 <th class="hidden-xs">Stat Product</th>
 <th>% Max Stat</th>
</tr>
<tr class="table-danger">
 <td><b>2997</b></td>
 <td>19.0</td>
 <td>12 / 0 / 5</td>
 <td>1496</td>
 <td class="hidden-sm">128.10</td>
 <td class="hidden-sm">101.90</td>
 <td class="hidden-sm">133</td>
 <td class="hidden-xs">1736099</td>
 <td>93.71%</td>
</tr>
<tr>
 <td>1</td>
 <td>19.0</td>
 <td>0 / 14 / 14</td>
 <td>1498</td>
 <td class="hidden-sm">121.11</td>
 <td class="hidden-sm">110.05</td>
 <td class="hidden-sm">139</td>
 <td class="hidden-xs">1852687</td>
 <td>100.00%</td>
</tr>
...

I can only get the table and its rows with this code:

Element table = document.select("table").get(0);
Elements rows = table.select("tr");

How can I extract these stats? The result should be:

Rank(2997) | Level (19.0) | IVs (12/0/5) | CP (1496) ...

With

Elements td = rows.select("td");
String stats = td.text();

I get a one-line string: 2997 19.0 12 / 0 / 5 1496 128.10 101.90 133 1736099 93.71% 1 19.0 0... and it's hard to work with.

I guess I need to store each row as a Stat object with these fields and put the objects into an ArrayList or something similar.

But first, I need to extract this data more cleanly instead of putting everything on one line. I need the power of Jsoup.

Upvotes: 0

Views: 576

Answers (1)

Olexandr Kisurin

Reputation: 56

You were on the right track but stopped one step short. Elements extends ArrayList<Element>, so you can loop through it like any other list.
Let's write a Stat class. Each object of this class will store the data of one row. You can also add getters, setters, and other methods for your business logic:

public class Stat {
    private String rank;
    private String level;
    private String ivs;
    private String cp;
    private String att;
    private String def;
    private String sta;
    private String statProduct;
    private String maxStat;

    public Stat(String rank, String level, String ivs, String cp, String att, String def, String sta, String statProduct, String maxStat) {
        this.rank = rank;
        this.level = level;
        this.ivs = ivs;
        this.cp = cp;
        this.att = att;
        this.def = def;
        this.sta = sta;
        this.statProduct = statProduct;
        this.maxStat = maxStat;
    }

    @Override
    public String toString() {
        return "Stat{" +
                "rank='" + rank + '\'' +
                ", level='" + level + '\'' +
                ", ivs='" + ivs + '\'' +
                ", cp='" + cp + '\'' +
                ", att='" + att + '\'' +
                ", def='" + def + '\'' +
                ", sta='" + sta + '\'' +
                ", statProduct='" + statProduct + '\'' +
                ", maxStat='" + maxStat + '\'' +
                '}';
    }
}

All that remains is to loop through the rows. Continuing from your code:

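Elements rows = table.select("tr");

for (int i = 0; i < rows.size(); i++) {
    Element row = rows.get(i);
    // Select only the <td> cells of this row; the header row contains <th> cells instead
    Elements td = row.select("td");
    if (td.size() < 9) {
        continue; // skip the header row and any incomplete rows
    }
    Stat stat = new Stat(
            td.get(0).text(), // Rank
            td.get(1).text(), // Level
            td.get(2).text(), // IVs (A/D/S)
            td.get(3).text(), // CP
            td.get(4).text(), // Att
            td.get(5).text(), // Def
            td.get(6).text(), // Sta
            td.get(7).text(), // Stat Product
            td.get(8).text()  // % Max Stat
    );

    System.out.println(stat);
}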
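
If you later want numbers instead of strings, the cell texts of one row can be converted once they are extracted. The following is only a sketch under assumptions: TypedStat and fromCells are hypothetical names that are not part of Jsoup, and the field types are guessed from the two sample rows in the question. It takes the nine cell texts of a row in table order (for example, what row.select("td").eachText() would return) and uses only the standard library:

```java
import java.util.List;

// Hypothetical typed variant of Stat; field types are assumptions based on the sample rows.
final class TypedStat {
    final int rank;
    final double level;
    final int[] ivs;             // attack / defense / stamina IVs
    final int cp;
    final double att;
    final double def;
    final int sta;
    final long statProduct;
    final double maxStatPercent; // "93.71%" is stored as 93.71

    private TypedStat(int rank, double level, int[] ivs, int cp, double att,
                      double def, int sta, long statProduct, double maxStatPercent) {
        this.rank = rank;
        this.level = level;
        this.ivs = ivs;
        this.cp = cp;
        this.att = att;
        this.def = def;
        this.sta = sta;
        this.statProduct = statProduct;
        this.maxStatPercent = maxStatPercent;
    }

    // Builds a TypedStat from the nine cell texts of one row, in table order.
    static TypedStat fromCells(List<String> cells) {
        if (cells.size() != 9) {
            throw new IllegalArgumentException("expected 9 cells, got " + cells.size());
        }
        // "12 / 0 / 5" -> {12, 0, 5}
        String[] ivParts = cells.get(2).split("/");
        int[] ivs = new int[ivParts.length];
        for (int i = 0; i < ivParts.length; i++) {
            ivs[i] = Integer.parseInt(ivParts[i].trim());
        }
        // Drop the trailing percent sign before parsing.
        String percent = cells.get(8).trim();
        if (percent.endsWith("%")) {
            percent = percent.substring(0, percent.length() - 1);
        }
        return new TypedStat(
                Integer.parseInt(cells.get(0).trim()),
                Double.parseDouble(cells.get(1).trim()),
                ivs,
                Integer.parseInt(cells.get(3).trim()),
                Double.parseDouble(cells.get(4).trim()),
                Double.parseDouble(cells.get(5).trim()),
                Integer.parseInt(cells.get(6).trim()),
                Long.parseLong(cells.get(7).trim()),
                Double.parseDouble(percent.trim()));
    }
}
```

Inside the loop you could then call TypedStat.fromCells(...) with the row's cell texts and add the result to an ArrayList<TypedStat>. A cell that does not match the assumed format makes the parse methods throw NumberFormatException, so real pages may need extra checks.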

Upvotes: 1
