Reputation: 37
I'm trying to scrape this webpage: http://www.skysports.com/football/competitions/la-liga/table. I just want the team names from the table. I'm using Jsoup for this. Here's my code:
private class LoadData extends AsyncTask<Void,Void,Void> {
    String url = "http://www.skysports.com/football/competitions/la-liga/table";
    String data = "";

    @Override
    protected Void doInBackground(Void... params) {
        Document document;
        try {
            document = Jsoup.connect(url).timeout(0).get();
            Elements clubName = document.select("td.standing-table__cell standing-table__cell--name");
            int a = clubName.size();
            for (int i = 0; i < a; i++) {
                data += "\n\n" + clubName.get(i).text();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
        return null;
    }

    @Override
    protected void onPostExecute(Void result) {
        teamview = (TextView) findViewById(R.id.club_view);
        teamview.setMovementMethod(new ScrollingMovementMethod());
        teamview.setText(data);
        super.onPostExecute(result);
    }
}
And here is the relevant HTML for one table row:
<tr class="standing-table__row" data-item-id="872">
<td class="standing-table__cell">1</td>
<td class="standing-table__cell standing-table__cell--name" data-short-name="Atletico Madrid" data-long-name="Atletico Madrid">
<a href="/football/teams/atletico-madrid" class="standing-table__cell--name-link">Atletico Madrid</a>
</td>
<td class="standing-table__cell">19</td>
<td class="standing-table__cell is-hidden--bp35">14</td>
<td class="standing-table__cell is-hidden--bp35">2</td>
<td class="standing-table__cell is-hidden--bp35">3</td>
<td class="standing-table__cell is-hidden--bp35">27</td>
<td class="standing-table__cell is-hidden--bp35">8</td>
<td class="standing-table__cell">19</td>
<td class="standing-table__cell" data-sort-value="1">44</td>
<td class="standing-table__cell is-hidden--bp15 is-hidden--bp35 " data-sort-value="15333033">
<div class="standing-table__form">
<span title="Granada 0-2 Atletico Madrid" class="standing-table__form-cell standing-table__form-cell--win"> </span><span title="Atletico Madrid 2-1 Athletic Bilbao" class="standing-table__form-cell standing-table__form-cell--win"> </span><span title="Malaga 1-0 Atletico Madrid" class="standing-table__form-cell standing-table__form-cell--loss"> </span><span title="Rayo Vallecano 0-2 Atletico Madrid" class="standing-table__form-cell standing-table__form-cell--win"> </span><span title="Atletico Madrid 1-0 Levante" class="standing-table__form-cell standing-table__form-cell--win"> </span><span title="Celta Vigo 0-2 Atletico Madrid" class="standing-table__form-cell standing-table__form-cell--win"> </span> </div>
</td>
</tr>
When I use document.select("td.standing-table__cell"); the data is shown. But when I use document.select("td.standing-table__cell standing-table__cell--name"); instead, no data is shown. Why?
Upvotes: 0
Views: 396
Reputation: 106
The code below loops over each row of the table and prints the club name from that row, selecting it by the CSS class of the link inside the name cell.
String url = "http://www.skysports.com/football/competitions/la-liga/table";
try {
    Document document = Jsoup.connect(url).timeout(0).get();
    Elements clubRow = document.select("tr.standing-table__row");
    for (Element club : clubRow) {
        System.out.println(club.select("a.standing-table__cell--name-link").text());
    }
} catch (IOException e) {
    e.printStackTrace();
}
Upvotes: 0
Reputation: 11712
The selector document.select("td.standing-table__cell standing-table__cell--name"); selects all elements with the tag name standing-table__cell--name that are descendants (direct or indirect) of td elements with the class standing-table__cell, because a space in a CSS selector is the descendant combinator. No such elements exist, so Jsoup returns an empty list.
What you probably want is to select td elements that have both classes, standing-table__cell and standing-table__cell--name. This can be done with a CSS selector like this:
document.select("td.standing-table__cell.standing-table__cell--name");
Note: A dot followed by a class name is the CSS class selector. Class selectors can be chained without spaces to require all of the listed classes on the same element.
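To make the "both classes on the same element" rule concrete without a network call, here is a minimal sketch in plain Java (no Jsoup) of what a chained class selector checks against an element's class attribute. The class ClassMatchSketch and the method hasAllClasses are hypothetical names used only for illustration; they are not part of Jsoup, and Jsoup's real matching logic is more involved.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class ClassMatchSketch {
    // Hypothetical helper: returns true only if the whitespace-separated
    // class attribute contains every one of the required class names.
    public static boolean hasAllClasses(String classAttr, String... required) {
        Set<String> present = new HashSet<>(Arrays.asList(classAttr.trim().split("\\s+")));
        return present.containsAll(Arrays.asList(required));
    }

    public static void main(String[] args) {
        // Class attributes copied from the table row in the question.
        String nameCell = "standing-table__cell standing-table__cell--name";
        String plainCell = "standing-table__cell is-hidden--bp35";

        // Corresponds to td.standing-table__cell.standing-table__cell--name
        System.out.println(hasAllClasses(nameCell,
                "standing-table__cell", "standing-table__cell--name"));  // true
        System.out.println(hasAllClasses(plainCell,
                "standing-table__cell", "standing-table__cell--name"));  // false
    }
}
```

Only the name cell carries both classes, which is why the chained selector returns just the club-name cells while td.standing-table__cell alone returns every cell in the row.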
Upvotes: 1