I have one pull of data from an API. The data has multiple nested lists. What is an efficient way to clean up this data?
For reference, I was trying to follow this post on parsing JSON with purrr, but my data has more levels of nesting, so I had some difficulty with it.
> jsonRespParsed %>% dput()
list(list(GameId = 14491L, Season = 2019L, SeasonType = 3L, Day = "2019-04-14T00:00:00",
DateTime = "2019-04-14T12:00:00", Status = "Final", AwayTeamId = 11L,
HomeTeamId = 14L, AwayTeamName = "NYI", HomeTeamName = "PIT",
GlobalGameId = 30014491L, GlobalAwayTeamId = 30000011L, GlobalHomeTeamId = 30000014L,
HomeTeamScore = 1L, AwayTeamScore = 4L, TotalScore = 5L,
PregameOdds = list(), LiveOdds = list(list(GameOddId = 384105L,
Sportsbook = NULL, GameId = 14491L, Created = "2019-04-14T14:26:30",
Updated = "2019-04-14T14:54:50", HomeMoneyLine = 300L,
AwayMoneyLine = -397L, HomePointSpread = 1.7, AwayPointSpread = -1.7,
HomePointSpreadPayout = -255L, AwayPointSpreadPayout = 207L,
OverUnder = 5.1, OverPayout = -187L, UnderPayout = 157L))),
list(GameId = 14492L, Season = 2019L, SeasonType = 3L, Day = "2019-04-14T00:00:00",
DateTime = "2019-04-14T19:00:00", Status = "Final", AwayTeamId = 6L,
HomeTeamId = 16L, AwayTeamName = "TB", HomeTeamName = "CBJ",
GlobalGameId = 30014492L, GlobalAwayTeamId = 30000006L,
GlobalHomeTeamId = 30000016L, HomeTeamScore = 3L, AwayTeamScore = 1L,
TotalScore = 4L, PregameOdds = list(), LiveOdds = list(
list(GameOddId = 385269L, Sportsbook = NULL, GameId = 14492L,
Created = "2019-04-14T21:16:43", Updated = "2019-04-14T21:44:55",
HomeMoneyLine = -475L, AwayMoneyLine = 327L,
HomePointSpread = -1.7, AwayPointSpread = 1.7,
HomePointSpreadPayout = 202L, AwayPointSpreadPayout = -254L,
OverUnder = 5.1, OverPayout = -174L, UnderPayout = 146L))),
list(GameId = 14493L, Season = 2019L, SeasonType = 3L, Day = "2019-04-14T00:00:00",
DateTime = "2019-04-14T19:30:00", Status = "Final", AwayTeamId = 22L,
HomeTeamId = 20L, AwayTeamName = "WPG", HomeTeamName = "STL",
GlobalGameId = 30014493L, GlobalAwayTeamId = 30000022L,
GlobalHomeTeamId = 30000020L, HomeTeamScore = 3L, AwayTeamScore = 6L,
TotalScore = 10L, PregameOdds = list(), LiveOdds = list(
list(GameOddId = 385329L, Sportsbook = NULL, GameId = 14493L,
Created = "2019-04-14T21:49:05", Updated = "2019-04-14T22:19:58",
HomeMoneyLine = NULL, AwayMoneyLine = NULL, HomePointSpread = 3.9,
AwayPointSpread = -3.9, HomePointSpreadPayout = -272L,
AwayPointSpreadPayout = 216L, OverUnder = 8.5,
OverPayout = -226L, UnderPayout = 184L))), list(
GameId = 14494L, Season = 2019L, SeasonType = 3L, Day = "2019-04-14T00:00:00",
DateTime = "2019-04-14T22:00:00", Status = "Final", AwayTeamId = 27L,
HomeTeamId = 35L, AwayTeamName = "SJ", HomeTeamName = "VEG",
GlobalGameId = 30014494L, GlobalAwayTeamId = 30000027L,
GlobalHomeTeamId = 30000035L, HomeTeamScore = 6L, AwayTeamScore = 3L,
TotalScore = 10L, PregameOdds = list(), LiveOdds = list(
list(GameOddId = 385764L, Sportsbook = NULL, GameId = 14494L,
Created = "2019-04-15T00:24:40", Updated = "2019-04-15T00:54:53",
HomeMoneyLine = NULL, AwayMoneyLine = NULL, HomePointSpread = -2.8,
AwayPointSpread = 2.8, HomePointSpreadPayout = 129L,
AwayPointSpreadPayout = -149L, OverUnder = 10.7,
OverPayout = 126L, UnderPayout = -145L))))
Upvotes: 0
Views: 45
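Your sample data looks pretty straightforward: each object in the primary list contains only a single non-empty nested list (LiveOdds, with one entry). You could unlist each object, convert it to a data.frame, and then bind them all together. Note that unlist() coerces every value to character and drops NULL fields and empty lists, so some games end up with fewer columns; bind_rows() fills those gaps with NA.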
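To make the flattening step concrete, here is a minimal sketch of what unlist() does to one record. The `game` object below is a made-up, shortened record shaped like one element of your list (an assumption for illustration, not your real data):

```r
# Hypothetical, shortened record shaped like one element of jsonRespParsed
game <- list(
  GameId = 1L,
  Status = "Final",
  PregameOdds = list(),                     # empty list: contributes nothing
  LiveOdds = list(list(Sportsbook = NULL,   # NULL field: dropped
                       OverUnder = 5.1))
)

flat <- unlist(game)

# Nested fields get a dotted prefix taken from the parent name
names(flat)   # "GameId" "Status" "LiveOdds.OverUnder"

# Mixing integers, doubles and strings coerces everything to character
class(flat)   # "character"
```

This is why the bound result has columns such as LiveOdds.OverUnder, and why every column comes back as text.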
Assuming your data is named "jsonRespParsed":
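library(dplyr)
# Flatten each game into a one-row data frame (all values become character)
games <- lapply(jsonRespParsed, function(game) data.frame(t(unlist(game)), stringsAsFactors = FALSE))
# Stack the rows; columns missing from a game are filled with NA
answer <- bind_rows(games)
answer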
#  GameId Season SeasonType                 Day            DateTime Status AwayTeamId HomeTeamId AwayTeamName
#1  14491   2019          3 2019-04-14T00:00:00 2019-04-14T12:00:00  Final         11         14          NYI
#2  14492   2019          3 2019-04-14T00:00:00 2019-04-14T19:00:00  Final          6         16           TB
#3  14493   2019          3 2019-04-14T00:00:00 2019-04-14T19:30:00  Final         22         20          WPG
#4  14494   2019          3 2019-04-14T00:00:00 2019-04-14T22:00:00  Final         27         35           SJ
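Because unlist() turns every value into text, you will probably want numeric columns back afterwards. One possible follow-up, sketched here on a small made-up stand-in for the result (`answer_chr` and its values are assumptions for illustration), is to let type.convert() re-infer each column's type:

```r
# Hypothetical stand-in for the bound result: every column is character
answer_chr <- data.frame(
  GameId = c("14491", "14493"),
  HomeTeamName = c("PIT", "STL"),
  LiveOdds.OverUnder = c("5.1", "8.5"),
  LiveOdds.HomeMoneyLine = c("300", NA),  # NA where a field was missing
  stringsAsFactors = FALSE
)

# Re-infer the type of each column; as.is = TRUE keeps text as character
# instead of converting it to factor
typed <- answer_chr
typed[] <- lapply(answer_chr, type.convert, as.is = TRUE)

str(typed)
```

The date-time strings (Day, DateTime, Created, Updated) would stay character with this approach; converting them, for example with as.POSIXct(), would be a separate step.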
Upvotes: 1