oercim

Reputation: 1848

Extracting data from a web page using R

I am trying to extract data from

http://www.covers.com/sports/NCAAB/matchups?selectedDate=2015-02-28

I am using the code below:

library(XML)
library(RCurl)

url1<-"http://www.covers.com/sports/NCAAB/matchups?selectedDate=2015-02-28"
data1<-htmlTreeParse(url1)
competype<-xpathSApply(xmlRoot(data1),"//div[@class = 'data-competition-type']")

However, competype outputs as an empty list.

Part of data1 looks like this:

  <div class="cmg_matchup_game_box" data-home-score="54" data-away-score="51" data-event-id="888836" data-index="147" data-following="false" data-last-update="2015-03-01T03:12:09.0000000" data-link="/Sports/NCAAB/Matchups/888836" data-handicap-difference="0.5" data-game-odd="-3.5" data-game-total="128" data-line-moves="7" data-sdi-event-id="/sport/basketball/competition:888836" data-game-date="2015-02-28 23:59:00" data-top-25="false" data-competition-type="Regular Season" data-conference="Big West" data-home-conference="Big West" data-away-conference="Big West">

I want to extract the value of the "data-competition-type" attribute. How can I do that using R? I would be very grateful for any help. Thanks a lot.

Upvotes: 1

Views: 89

Answers (1)

user6022341


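This should work. The XPath in the question tests for a div whose class is 'data-competition-type', but data-competition-type is an attribute of the div, not its class, so nothing matches. Select the divs by their class (cmg_matchup_game_box) and read the attribute from each node: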

nodes <- getNodeSet(xmlRoot(data1),"//div[@class = 'cmg_matchup_game_box']")
sapply(nodes, xmlGetAttr, "data-competition-type")
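For anyone who wants to try the idea without fetching the live page, here is a minimal self-contained sketch. It assumes the XML package is installed. The HTML string is a hypothetical two-game snippet modeled on the markup quoted in the question (the second div and its values are made up for illustration), and `comp_type` is just an illustrative variable name.

```r
library(XML)

# Hypothetical snippet modeled on the markup in the question.
# The second div and its attribute values are invented for illustration.
html <- '<html><body>
  <div class="cmg_matchup_game_box" data-event-id="888836" data-competition-type="Regular Season" data-conference="Big West"></div>
  <div class="cmg_matchup_game_box" data-event-id="000000" data-competition-type="Conference Tournament" data-conference="Example"></div>
</body></html>'

# Parse the string into an internal document that XPath functions accept.
doc <- htmlParse(html, asText = TRUE)

# Select the game boxes by their class, then read one attribute from each node.
comp_type <- xpathSApply(doc, "//div[@class = 'cmg_matchup_game_box']",
                         xmlGetAttr, "data-competition-type")
print(comp_type)
```

The same pattern should carry over to the other data-* attributes (for example "data-conference") by changing the attribute name passed to xmlGetAttr. Whether the live page still uses this markup is not something this sketch can confirm.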

Upvotes: 1
