ASH

Reputation: 20342

How can I concatenate data from a web site into a data frame?

I'm trying to loop through 10 web pages and concatenate all the data into one single file.

Below is my non-working code. How can I make this loop through all 10 pages, and append the data from the next page under the previous page?

library(XML)
library(plyr)

for(i in 1:10)
{
NHL <- htmlParse("http://www.hockey-reference.com/friv/dailyleaders.cgi?month=1&day=i&year=2014")
class(NHL)
NHL.tables <- readHTMLTable(NHL,stringAsFactors = FALSE)
length(NHL.tables)
head(rbind.fill(NHL.tables))
write.csv(NHL.tables, file = "NHLData.csv")
}

I thought it was an issue of pulling the data, 1 URL at a time, and binding it together as I go, but it doesn't seem to work. I'm sure I'm missing something simple. Any thoughts? Thank you.

Upvotes: 0

Views: 40

Answers (1)

cory

Reputation: 6669

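This should get you close. There are two tables per page; I took the biggest one to be the one you wanted. Two things go wrong in your version: the i inside the quoted URL is a literal character, not the loop variable, so the URL has to be built with paste0; and write.csv inside the loop overwrites the file on every pass, so the rows need to be accumulated and written once after the loop.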

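library(XML)

# Accumulates the rows from every page
df <- NULL
for(i in 1:10)
{
  # Build the URL from the loop variable instead of a literal "i"
  url <- paste0("http://www.hockey-reference.com/friv/dailyleaders.cgi?month=1&day=",
                i, "&year=2014")
  NHL <- htmlParse(url)
  # Note the spelling: stringsAsFactors, not stringAsFactors
  NHL.tables <- readHTMLTable(NHL, stringsAsFactors = FALSE)
  # Append this page's first table under the rows collected so far
  df <- rbind(df, NHL.tables[[1]])
}
# Write once, after the loop, so the file holds all 10 pages
write.csv(df, file = "NHLData.csv")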
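
If you would rather not grow df inside the loop, here is a rough sketch of an alternative: collect one table per page in a list and bind them in a single call at the end. To keep it runnable without network access, fetch_tables is a hypothetical stand-in that returns two small made-up tables per page; in real use it would be readHTMLTable(htmlParse(url), stringsAsFactors = FALSE). largest_table is also a hypothetical helper, assuming that "the biggest table" means the one with the most rows.

```r
# Build all ten URLs at once; paste0 is vectorised over the day numbers.
days <- 1:10
urls <- paste0("http://www.hockey-reference.com/friv/dailyleaders.cgi?month=1&day=",
               days, "&year=2014")

# Hypothetical stand-in for readHTMLTable(htmlParse(url), ...):
# returns two fake tables per page so the sketch runs offline.
fetch_tables <- function(url, day) {
  list(
    data.frame(day = day, goalie = "G", saves = 30, stringsAsFactors = FALSE),
    data.frame(day = day, player = c("A", "B"), pts = c(2, 1),
               stringsAsFactors = FALSE)
  )
}

# Hypothetical helper: pick the table with the most rows.
largest_table <- function(tables) {
  tables[[which.max(vapply(tables, nrow, integer(1)))]]
}

# One data frame per page, collected in a list.
pages <- lapply(seq_along(urls), function(k) {
  largest_table(fetch_tables(urls[k], days[k]))
})

# Bind every page in a single call, then write the file once.
df <- do.call(rbind, pages)
write.csv(df, file = "NHLData.csv", row.names = FALSE)
```

Binding once with do.call(rbind, pages) avoids re-copying the accumulated data frame on every iteration, which starts to matter when you loop over many more than 10 pages.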

Upvotes: 1
