Seth Brundle

Reputation: 170

Converting a large list with multiple elements to data frame / tibble

I need help!

I'm using the nhlapi package to scrape some boxscores. What I get, either from the package or from calling the fromJSON function directly, is a large nested list.

I have tried everything to convert this into a data frame. I believe the issue is that when this is reshaped it introduces many NA values. I've gotten the error below several times. I'm totally stuck here.

I don't know how to reproduce the problem with sample data of my own, so I'm sharing the call I'm using below. Thanks!

Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, : arguments imply differing number of rows: 1, 0

# install.packages("nhlapi")
library(nhlapi)

boxscores <- nhl_games_boxscore(gameIds = 2021010001:2021010002)

The actual API end point for one of the games: https://statsapi.web.nhl.com/api/v1/game/2021010001/boxscore

Upvotes: 1

Views: 649

Answers (1)

Brian Syzdek

Reputation: 948

I don't get an error when running your code above. It returns a large list. You can then look at the list elements, some of which already seem to be data frames. For example (I just used the first game ID in your range):

boxscores <- nhl_games_boxscore(gameIds = 2021010001)
str(boxscores[[1]]$officials)

Returns

'data.frame':   4 obs. of  4 variables:
 $ officialType     : chr  "Referee" "Referee" "Linesman" "Linesman"
 $ official.id      : int  2303 7405 4694 7944
 $ official.fullName: chr  "Kevin Pollock" "Mitch Dunning" "Shandor Alphonso" "Caleb Apperson"
 $ official.link    : chr  "/api/v1/people/2303" "/api/v1/people/7405" "/api/v1/people/4694" "/api/v1/people/7944"

If you're still getting the error, maybe try restarting your R session?

EDIT: per the comment, here is the structure of the list, which can be viewed by clicking on boxscores in the data window. You can then click through the areas of interest in the main window to expand the structure. I clicked on the expanding arrow next to teams to get:

boxscores
|
+--[[1]]
  |
  +--copyright
  +--teams
    |
    +--away
      |
      +-- (other options)
      +-- players
  +--officials
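
If you prefer to inspect the structure from the console rather than by clicking, str() with max.level and names() give a similar view. This is only a sketch: mock_boxscores is a made-up stand-in that imitates the shape shown above, not real API output, and its field values are invented.

```r
# Hypothetical stand-in for the nested list shown above (not real API data).
mock_boxscores <- list(
  list(
    copyright = "mock",
    teams = list(
      away = list(
        team = list(id = 1L, name = "Mock Away"),
        players = list(
          ID1 = list(person = list(id = 1L, fullName = "Player One")),
          ID2 = list(person = list(id = 2L, fullName = "Player Two"))
        )
      )
    ),
    officials = data.frame(officialType = c("Referee", "Linesman"))
  )
)

# Print only the first two levels of nesting for the first game.
str(mock_boxscores[[1]], max.level = 2)

# names() at each level shows which key can come next in the chain.
top_level_names <- names(mock_boxscores[[1]])
away_names <- names(mock_boxscores[[1]][["teams"]][["away"]])
```

With the real object you would swap mock_boxscores for boxscores and keep calling names() one level deeper until you reach the part you want.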

You then just have to build a chain of names specifying what you want to pull out of it. Let's say you want the away team's players. I can get them with:

away_players <- boxscores[[1]][["teams"]][["away"]][["players"]]

This then gives a list of lists. You can drill down further and create a df by:

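library(dplyr)

# For each player, pull the "person" list and convert it to a one-row data frame
lapply(seq_along(away_players), function(i) {
  away_players[[i]][["person"]] %>%
    data.frame()
}) %>%
  bind_rows()  # stack the one-row data frames into a single df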

That goes through each list item in away_players, pulls its person list, converts it to a df, and then binds them all together into one df.
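
One way to run into the error quoted in the question with this kind of conversion is a list that has an empty (zero-length) element, since data.frame() cannot line up a column of length 0 with columns of length 1. I can't confirm that this is what happened in your case, so treat the following as a sketch on made-up data. The field values and the helper person_to_df are hypothetical and not part of nhlapi, and it uses base R do.call(rbind, ...) instead of bind_rows() so that it runs without extra packages.

```r
# Made-up players; the second has an empty field, as sparse data can.
mock_players <- list(
  ID1 = list(person = list(id = 1L, fullName = "Player One",
                           shootsCatches = "L")),
  ID2 = list(person = list(id = 2L, fullName = "Player Two",
                           shootsCatches = character(0)))
)

# Converting the second person directly fails; capture the message.
err <- tryCatch(
  data.frame(mock_players[["ID2"]][["person"]]),
  error = function(e) conditionMessage(e)
)

# Hypothetical helper: replace zero-length elements with NA, then convert.
# It only looks at the top level of the person list.
person_to_df <- function(player) {
  person <- player[["person"]]
  person[lengths(person) == 0] <- NA
  data.frame(person, stringsAsFactors = FALSE)
}

# One row per player, with NA where a field was empty.
players_df <- do.call(rbind, lapply(mock_players, person_to_df))
```

Filling with NA keeps the same columns for every player, which is what lets rbind() stack the rows. Empty elements nested deeper inside person would need the same treatment at that level.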

Upvotes: 1
