Luke Fox

Reputation: 31

Trouble reading multiple rds files at the same time and saving them to a single data frame

I have a for loop set up for some rds files and I need a way to read all of them. I could read them one by one and merge them manually with rbind, but I would prefer to get them into a single data frame directly.

YearsSeq <- seq(2010,2020,1)
for (year in YearsSeq) {
  Allrds <- paste0('https://raw.githubusercontent.com/guga31bb/nflfastR-data/master/data/play_by_play_', sprintf('%02d', YearsSeq), '.rds')
}

With the code above, Allrds holds the URL of each rds file, so that Allrds[1] points to the data from 2010, Allrds[2] to the data from 2011, etc.

Using readRDS(Allrds) doesn't work; it fails with the error "Error in gzfile(file, "rb") : invalid 'description' argument"

Any help would be appreciated!!

Upvotes: 0

Views: 363

Answers (1)

neilfws

Reputation: 33822

You have a couple of issues here. First, an apply function such as lapply will work better than a loop. Second, readRDS takes a single file name or connection, not a vector of them, and it cannot read from a GitHub URL passed as a plain string: you need to open a connection using url.

So you can read into a list of data frames like this:

Allrds <- lapply(2010:2020, function(x) {
  # build the URL for one season and read it through a url() connection
  readRDS(url(paste0("https://raw.githubusercontent.com/guga31bb/nflfastR-data/master/data/play_by_play_", x, ".rds")))
})
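If you want to try the pattern without downloading anything, here is a minimal sketch that assumes two small made-up data frames saved to temporary files (the file names, the `pbp_list` name and the contents are illustrative, not the real play-by-play data). It shows that handing readRDS the whole vector of paths fails, while lapply reads one file per call:

```r
# Hypothetical stand-in data: two tiny data frames saved as .rds files
# in a temporary directory (not the real play-by-play files).
years <- 2010:2011
paths <- file.path(tempdir(), paste0("play_by_play_", years, ".rds"))
for (i in seq_along(years)) {
  saveRDS(data.frame(season = years[i], play_id = 1:3), paths[i])
}

# Passing the whole vector at once fails, because readRDS expects a
# single file name or connection. The error message is captured as text.
res <- tryCatch(readRDS(paths), error = function(e) conditionMessage(e))
print(res)

# Reading one file per call works and returns a list of data frames.
pbp_list <- lapply(paths, readRDS)
length(pbp_list)
```

For the remote files, the only difference is that each path is wrapped in url(), as in the line above.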

And you can bind into a single data frame like this:

Allrds <- do.call(rbind, Allrds)

# check size of data
dim(Allrds)

[1] 529480    340
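To see what do.call(rbind, ...) does on a small scale, here is a sketch that assumes two made-up data frames in place of the downloaded seasons (`pbp_list` and its columns are illustrative). Note that rbind expects the data frames to share the same column names:

```r
# Hypothetical stand-ins for the per-season data frames.
pbp_list <- list(
  data.frame(season = 2010, play_id = 1:3),
  data.frame(season = 2011, play_id = 1:2)
)

# do.call() passes each list element to rbind() as a separate argument,
# so this is the same as rbind(pbp_list[[1]], pbp_list[[2]]).
combined <- do.call(rbind, pbp_list)

# Row counts add up across the list; the number of columns is unchanged.
dim(combined)
same_as_manual <- identical(combined, rbind(pbp_list[[1]], pbp_list[[2]]))
same_as_manual
```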

You will lose track of which file each row came from when you rbind, but I see the data already contains a season column, so that is not an issue.
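If your files did not carry such a column, one option is to tag each data frame before binding. A minimal sketch, assuming made-up data frames and a hypothetical column name `source_year` that is not part of the real data:

```r
# Hypothetical per-file data frames without any year column.
years <- 2010:2011
pbp_list <- list(
  data.frame(play_id = 1:3),
  data.frame(play_id = 1:2)
)

# Add the year each data frame was read for, then bind as before.
tagged <- Map(function(df, yr) {
  df$source_year <- yr
  df
}, pbp_list, years)
combined <- do.call(rbind, tagged)

# Count rows per source year to confirm the tag survived the rbind.
table(combined$source_year)
```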

Upvotes: 1
