Reputation: 1005
I parse data from this web site using JSoup:
http://www.skore.com/en/soccer/england/premier-league/results/all/
I get the name of team and result, and I also need to the get name of scorer (it is under result).
I am trying it but having trouble as it is not in HTML.
Is it possible? If yes how?
Upvotes: 1
Views: 479
Reputation: 135872
The scorers infos are acquired after an AJAX request (that occurs when you click the score link). You'll have to make such request and parse the result.
For instnace, take the first game (Manchester United 1x2 Manchester City), its tag is:
<a data-y="r1-1229442" data-v="england-premierleague-manchesterunited-manchestercity-13april2013" style="cursor: pointer;">1 - 2</a>
Take data-y
, remove the leading r
and make a get request to:
http://www.skore.com/en/scores/soccer/id/<DATA-Y_HERE>?fmt=html
Such as: http://www.skore.com/en/scores/soccer/id/1-1229442?fmt=html. And then parse the result.
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
public class ParseScore {
public static void main(String[] args) throws Exception {
Document doc = Jsoup.connect("http://www.skore.com/en/soccer/england/premier-league/results/all/").get();
System.out.println("title: " + doc.title());
Elements dls = doc.select("dl");
for (Element link : dls) {
String id = link.attr("id");
/* check if then it is a game <dl> */
if (id != null && id.length() > 3 && "rid".equals(id.substring(0, 3))) {
System.out.println("Game: " + link.text());
String idNoRID = id.replace("rid", "");
// String idNoRID = "1-1229442";
String scoreURL = "http://www.skore.com/en/scores/soccer/id/" + idNoRID + "?fmt=html";
Document docScore = Jsoup.connect(scoreURL).get();
Elements trs = docScore.select("tr");
for (Element tr : trs) {
Elements spanGoal = tr.select("span.goal");
/* only enter if there is a goal */
if (spanGoal.size() > 0) {
Elements score = tr.select("td.score");
String playerName = spanGoal.get(0).text();
String currentScore = score.get(0).text();
System.out.println("\t\tGOAL: " + currentScore + ": " + playerName);
}
Elements spanGoalPenalty = tr.select("span.goalpenalty");
/* only enter if there is a goal */
if (spanGoalPenalty.size() > 0) {
Elements score = tr.select("td.score");
String playerName = spanGoalPenalty.get(0).text();
String currentScore = score.get(0).text();
System.out.println("\t\tGOAL: " + currentScore + ": " + playerName + " (penalty)");
}
Elements spanGoalOwn = tr.select("span.goalown");
/* only enter if there is a goal */
if (spanGoalOwn.size() > 0) {
Elements score = tr.select("td.score");
String playerName = spanGoalOwn.get(0).text();
String currentScore = score.get(0).text();
System.out.println("\t\tGOAL: " + currentScore + ": " + playerName + " (own goal)");
}
}
}
}
}
}
Output:
title: Skore : Premier League, England - Soccer Results (All)
Game: F T Arsenal 3 - 1 Norwich
GOAL: 0 - 1: Michael Turner
GOAL: 1 - 1: Mikel Arteta (penalty)
GOAL: 2 - 1: Sébastien Bassong (own goal)
GOAL: 3 - 1: Lukas Podolski
Game: F T Aston Villa 1 - 1 Fulham
GOAL: 1 - 0: Charles N´Zogbia
GOAL: 1 - 1: Fabian Delph (own goal)
Game: F T Everton 2 - 0 Queens Park Rangers
GOAL: 1 - 0: Darron Gibson
GOAL: 2 - 0: Victor Anichebe
Game: F T Reading 0 - 0 Liverpool
Game: F T Southampton 1 - 1 West Ham
GOAL: 1 - 0: Gaston Ramirez
GOAL: 1 - 1: Andrew Carroll
Game: F T Manchester United 1 - 2 Manchester City
GOAL: 0 - 1: James Milner
...
JSoup 1.7.1 was used. If using maven, add this to your pom.xml
:
<dependency>
<groupId>org.jsoup</groupId>
<artifactId>jsoup</artifactId>
<version>1.7.1</version>
</dependency>
Upvotes: 3