Reputation: 203
I'd like to get data from an HTML table which looks like this:
<tr>
<td rowspan="30" class="listWeekday">Mo</td>
<td class="listStart">05:00</td>
<td class="listEnd">08:30</td>
</tr>
<tr>
<td... unknown value of Start and End td's> </td></tr>
<tr>
<td rowspan="30" class="listWeekday">Tu</td>
<td.. same as Monday, continues so till Friday></td></tr>
I'd like to parse this table with Jsoup. I tried the select() method with "td.listWeekday" and looped over the result:
Elements values = doc.select("td.listWeekday");
for (Element elem : values) {
    System.out.println(elem.text());
}
That works fine, but when I select the listStart values the same way, I get the data from all days at once. I'd like to separate them, so that I get the listStart and listEnd values for each day.
I think this is possible, but I don't have a clue where to start, because the number of listStart and listEnd cells differs from day to day.
Upvotes: 2
Views: 537
Reputation: 11712
Analyzing tables with rowspan entries is not straightforward in Jsoup or any other HTML library I know. Because the weekday cell only appears in the first row of each day's block, what you could do in your case is keep a simple variable holding the current day while iterating over all rows. Something like this:
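import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

String URL = "http://pastebin.com/raw/Sa2MRCTQ";
Document doc = Jsoup.connect(URL).get();
// Only rows that contain a start-time cell. The class names are presumably those
// of the linked page; the question calls them listWeekday and listStart.
Elements trs = doc.select("tr:has(td.liste-startzeit)");
String currentDay = null;
for (Element tr : trs) {
    // The weekday cell exists only in the first row of each day (rowspan).
    Element tdDay = tr.select("td.liste-wochentag").first();
    if (tdDay != null) {
        currentDay = tdDay.text(); // remember the day for the rows that follow
    }
    Element tdStart = tr.select("td.liste-startzeit").first();
    System.out.println(currentDay + " : " + tdStart.text());
}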
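If you also need the end times and want the slots grouped per day, the same "remember the current day" idea can fill a map instead of printing. The following is only a sketch under assumptions: it uses the class names from the question (listWeekday, listStart, listEnd), and the class ScheduleByDay, the parse helper, and the sample HTML are made up for illustration. The sample is wrapped in a table element because an HTML5-style parser ignores tr/td tags that appear outside a table.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ScheduleByDay {

    // Made-up sample in the shape of the question's table (two days, three slots).
    static final String SAMPLE_HTML =
        "<table>"
        + "<tr><td rowspan=\"2\" class=\"listWeekday\">Mo</td>"
        + "<td class=\"listStart\">05:00</td><td class=\"listEnd\">08:30</td></tr>"
        + "<tr><td class=\"listStart\">09:00</td><td class=\"listEnd\">12:00</td></tr>"
        + "<tr><td rowspan=\"1\" class=\"listWeekday\">Tu</td>"
        + "<td class=\"listStart\">06:00</td><td class=\"listEnd\">07:15</td></tr>"
        + "</table>";

    // Returns day -> list of {start, end} pairs, in document order.
    static Map<String, List<String[]>> parse(String html) {
        Document doc = Jsoup.parse(html);
        Map<String, List<String[]>> byDay = new LinkedHashMap<>();
        String currentDay = null;
        for (Element tr : doc.select("tr:has(td.listStart)")) {
            // A weekday cell is present only in the first row of each day.
            Element tdDay = tr.select("td.listWeekday").first();
            if (tdDay != null) {
                currentDay = tdDay.text();
            }
            Element tdStart = tr.select("td.listStart").first();
            Element tdEnd = tr.select("td.listEnd").first();
            if (currentDay == null || tdEnd == null) {
                continue; // skip rows before the first weekday or without an end cell
            }
            byDay.computeIfAbsent(currentDay, d -> new ArrayList<>())
                 .add(new String[] { tdStart.text(), tdEnd.text() });
        }
        return byDay;
    }

    public static void main(String[] args) {
        parse(SAMPLE_HTML).forEach((day, slots) -> {
            for (String[] slot : slots) {
                System.out.println(day + " " + slot[0] + " - " + slot[1]);
            }
        });
    }
}
```

A LinkedHashMap keeps the days in the order they appear in the table, so the number of slots per day can vary freely without any special handling.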
Upvotes: 2