Anthony Amico

Reputation: 83

Bind tables in a loop in R

I am trying to pull data in large quantities from Sports Reference. My coding background is pretty weak, as I am self-taught in only a few processes. I have figured out how to pull the data from SR using the htmltab() function, and can create a table from each page on the website.

My issue is combining the tables at the end. I know the code below only uses 5 pages, and would be very easy to combine using rbind(), but it is just a small test sample.

I will ultimately have thousands of tables to combine, so it isn't practical to rbind them manually at the end. Is there a way to tack on each new table to some composite table at each step of the loop (or to easily bind them at the end without typing out thousands of tables)?

Alternatively, it would seem more efficient to combine all of the data into a single table directly, without having to make a thousand separate tables first, but I have no idea how to do that (obviously).

Any help is appreciated!

(For those not familiar with SR, the site returns its tables in pages of 100 rows, hence the i*100 offset that paste() appends to the first part of the URL.)

for (i in 1:5) {
     a <- i*100
     url <- paste("https://www.sports-reference.com/cfb/play-index/pgl_finder.cgi?request=1&match=game&year_min=&year_max=&conf_id=&school_id=&opp_id=&game_type=&game_num_min=&game_num_max=&game_location=&game_result=&class=&c1stat=rush_att&c1comp=gt&c1val=0&c2stat=rec&c2comp=gt&c2val=0&c3stat=punt_ret&c3comp=gt&c3val=0&c4stat=kick_ret&c4comp=gt&c4val=0&order_by=date_game&order_by_asc=&offset=",a,sep = "")
     nam <- paste("ploop",i,sep = "")
     assign(nam,htmltab(url))
     # ?????? (what goes here to add this table to a combined one?)
     }

Upvotes: 0

Views: 949

Answers (2)

José

Reputation: 931

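You can also try the tidyverse way. purrr's map_dfr() calls the function once per page and row-binds the results into a single data frame, so no intermediate tables are needed: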

url <- "https://www.sports-reference.com/cfb/play-index/pgl_finder.cgi?request=1&match=game&year_min=&year_max=&conf_id=&school_id=&opp_id=&game_type=&game_num_min=&game_num_max=&game_location=&game_result=&class=&c1stat=rush_att&c1comp=gt&c1val=0&c2stat=rec&c2comp=gt&c2val=0&c3stat=punt_ret&c3comp=gt&c3val=0&c4stat=kick_ret&c4comp=gt&c4val=0&order_by=date_game&order_by_asc=&offset="

df <- purrr::map_dfr(1:5, ~ htmltab::htmltab(paste0(url, .x * 100)))
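
To see the shape of the result without hitting the site, here is a minimal offline sketch. It assumes purrr and dplyr are installed, and fetch_page() is a hypothetical placeholder for htmltab::htmltab() that returns made-up rows. Note that in purrr 1.0.0 and later map_dfr() is marked as superseded; map() followed by list_rbind() is the suggested replacement, though map_dfr() still works.

```r
library(purrr)  # map_dfr() also needs dplyr to be installed

# Hypothetical placeholder for htmltab::htmltab(paste0(url, offset)):
# returns three made-up rows per page instead of scraping anything.
fetch_page <- function(offset) {
  data.frame(Rk = as.character(offset + 1:3), stringsAsFactors = FALSE)
}

# One call per page; the results are row-bound into a single data frame.
df <- map_dfr(1:5, ~ fetch_page(.x * 100))
nrow(df)
```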

Upvotes: 1

jdobres

Reputation: 11957

In situations like this, it's often best to store your results in a list rather than mucking around with assign. Here we store the result of each iteration of the loop in a list, and then use do.call with rbind to create a single data frame:

library(htmltab)  # provides htmltab()

# Collect each page's table in a list instead of creating separate variables
tables <- list()
for (i in 1:5) {
  a <- i*100  # row offset for page i (the site serves 100 rows per page)
  url <- paste("https://www.sports-reference.com/cfb/play-index/pgl_finder.cgi?request=1&match=game&year_min=&year_max=&conf_id=&school_id=&opp_id=&game_type=&game_num_min=&game_num_max=&game_location=&game_result=&class=&c1stat=rush_att&c1comp=gt&c1val=0&c2stat=rec&c2comp=gt&c2val=0&c3stat=punt_ret&c3comp=gt&c3val=0&c4stat=kick_ret&c4comp=gt&c4val=0&order_by=date_game&order_by_asc=&offset=",a,sep = "")
  tables[[i]] <- htmltab(url)
}

# Row-bind every table in the list into one data frame
table.final <- do.call(rbind, tables)

str(table.final)

'data.frame':   520 obs. of  20 variables:
 $ Rk              : chr  "101" "102" "103" "104" ...
 $ Player          : chr  "Myles Gaskin" "Willie Gay" "Jake Gervase" "Kyle Gibson" ...
 $ Date            : chr  "2019-01-01" "2019-01-01" "2019-01-01" "2019-01-01" ...
 $ G#              : chr  "14" "13" "13" "13" ...
 $ School          : chr  "Washington" "Mississippi State" "Iowa" "Central Florida" ...
 $ V2              : chr  "N" "N" "N" "N" ...
 $ Opponent        : chr  "Ohio State" "Iowa" "Mississippi State" "Louisiana State" ...
 $ V2.1            : chr  "L" "L" "W" "L" ...
 $ Rushing >> Att  : chr  "24" "0" "0" "0" ...
 $ Rushing >> Yds  : chr  "121" "0" "0" "0" ...
 $ Rushing >> TD   : chr  "2" "0" "0" "0" ...
 $ Receiving >> Rec: chr  "3" "0" "0" "0" ...
 $ Receiving >> Yds: chr  "-1" "0" "0" "0" ...
 $ Receiving >> TD : chr  "0" "0" "0" "0" ...
 $ Kick Ret >> Ret : chr  "0" "0" "0" "0" ...
 $ Kick Ret >> Yds : chr  "0" "0" "0" "0" ...
 $ Kick Ret >> TD  : chr  "0" "0" "0" "0" ...
 $ Punt Ret >> Ret : chr  "0" "0" "0" "0" ...
 $ Punt Ret >> Yds : chr  "0" "0" "0" "0" ...
 $ Punt Ret >> TD  : chr  "0" "0" "0" "0" ...
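
The same list-then-bind pattern can be tried without touching the site. In the sketch below, fetch_page() is a hypothetical stand-in for htmltab(url) that returns a small made-up data frame; only the structure (fill a list, then do.call(rbind, ...)) mirrors the code above. It also shows lapply() as a loop-free way to build the same list, which may be convenient once there are thousands of pages.

```r
# Hypothetical stand-in for htmltab(url): three made-up rows per page.
fetch_page <- function(offset) {
  data.frame(
    Rk     = as.character(offset + 1:3),
    Player = paste("player", offset + 1:3),
    stringsAsFactors = FALSE
  )
}

offsets <- (1:5) * 100

# Preallocate the list so it does not have to grow on every iteration.
tables <- vector("list", length(offsets))
for (i in seq_along(offsets)) {
  tables[[i]] <- fetch_page(offsets[i])
}

# Bind all pages into one data frame in a single call.
table.final <- do.call(rbind, tables)

# Equivalent without an explicit loop: lapply() builds the list directly.
table.lapply <- do.call(rbind, lapply(offsets, fetch_page))
```

Binding once at the end avoids calling rbind() inside the loop, which would copy the growing data frame on every iteration.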

Upvotes: 1
