bodega18

Reputation: 654

Shinyapps.io Converting All Numeric Data to NA

I am trying to render a table on shinyapps.io, but it is populated entirely with NAs. I am scraping NCAA basketball spreads from https://www.vegasinsider.com/college-basketball/odds/las-vegas/. Locally, the table renders fine, but on shinyapps.io all the numeric spreads display as NA. The table only displays correctly on shinyapps.io if all the spread values are characters, but then I cannot perform any math operations on them. As soon as the BetMGM, Caesers, and FanDuel columns are numeric, they display as NA. I'll provide some code and data to help recreate the issue. There were a lot of data-cleaning steps that I will skip for the sake of brevity.

@akrun here is the code to scrape the table. I run this and then use some regex to split game_info into its components.

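# Table Scraping Code

library(rvest)  # read_html(), html_table()
library(dplyr)  # %>%, rename()

# Despite its name, `url` holds the parsed HTML document, not the URL string
url <- read_html("https://www.vegasinsider.com/college-basketball/odds/las-vegas/")

# html_table() returns a list with one data frame per table on the page
spread_table <- url %>% html_table(fill = TRUE)

# Keep the 8th table
spread_table <- spread_table[[8]]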


spread_table <- spread_table %>%
  rename(game_info = X1,
         VegasInsiderOpen = X2,
         BetMGM = X3,
         Caesers = X4,
         Circa = X5,
         FanDuel = X6,
         DraftKings = X7,
         PointsBet = X8,
         SuperBook = X9,
         VegasInsiderConsensus = X10)



# A tibble: 8 × 15 (spread_table)

 date   time      away_team_name  home_team_name  BetMGM  Caesers  FanDuel 
<chr>   <chr>         <chr>          <chr>        <dbl>   <dbl>    <dbl>  
 12/23  7:00 PM   George Mason    Wisconsin       -11.5   -11.5    -11.5
 12/23  4:00 PM   Liberty         Stanford        -1.5    -2.0     -2.0
 12/23  10:00 PM  BYU             Vanderbilt       4.0     5.5      5.5
 12/24  12:00 AM  South Florida   Hawaii          -4      -3.5      NA

An extremely simplified version of the Shiny app:

library(shiny)

ui <- fluidPage(
  titlePanel("NCAAB Spreads App"),
  tableOutput("upcoming_games")
)

server <- function(input, output, session) {

  output$upcoming_games <- renderTable({

    spread_table

  })

}

shinyApp(ui, server)

@akrun


Syracuse, Xavier, Ball State, Notre Dame, Boise State, and St. Marys are the favored teams in this subset, but I cannot tell that from the data frame your code returns.


@jpdugo17 Here is the data frame, so it is not lost:

structure(list(
  date = c("12/27", "12/28", "12/28", "12/28", "12/28", "12/28"),
  time = c("6:00 PM", "7:00 PM", "8:00 PM", "8:00 PM", "9:00 PM", "10:00 PM"),
  away_team_name = c("Brown", "Connecticut", "Ball State", "Notre Dame", "Fresno State", "Yale"),
  home_team_name = c("Syracuse", "Xavier", "Northern Illinois", "Pittsburgh", "Boise State", "St. Marys (CA)"),
  VegasInsiderOpen = c(-10.5, -3, -3, -6, -4, -12.5),
  BetMGM = c(-9.5, NA, NA, NA, NA, NA),
  Caesers = c(-10, NA, NA, -3.5, -4, -13),
  Circa = c(-9.5, NA, NA, NA, NA, NA),
  FanDuel = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_),
  DraftKings = c(-9.5, -3, -2, -3.5, -3.5, -12.5),
  PointsBet = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_),
  SuperBook = c(-9.5, NA, NA, -4, -4, -13),
  VegasInsiderConsensus = c(-9.5, -3, -2, -4, -4, -13)
), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -6L))

Upvotes: 2

Views: 93

Answers (1)

akrun

Reputation: 887088

It seems that the spread_table, after scraping, may have been post-processed in a way that failed to convert the extracted substrings to numeric class. When we call as.numeric on a string that contains any non-numeric character, the result is NA.
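As a small illustration of that failure mode, here is a sketch in base R. The cell values are made up for the example and are not taken from the scraped page, so the real strings may look different.

```r
# Hypothetical cell values (assumption): a plain decimal, an integer,
# a spread written with the "½" character, a text label, and a missing cell.
raw_spreads <- c("-11.5", "4", "-11\u00bd", "PK", NA)

# as.numeric() returns NA (with a warning) for any string it cannot parse.
parsed <- suppressWarnings(as.numeric(raw_spreads))
print(parsed)

# Flag the values that were present in the input but failed to convert.
failed <- !is.na(raw_spreads) & is.na(parsed)
print(raw_spreads[failed])
```

Running a check like this on the character columns, both locally and in the deployed app, may show which strings are being turned into NA.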

In the code below, select the columns of interest after scraping, then use tidyr::extract to split the 'game_info' column into 'date', 'time', 'away_team_name' and 'home_team_name' with a regex whose capture groups ((...)) pick out each piece. ^(\\S+) captures the first group as one or more non-whitespace characters from the start (^) of the string. This is followed by one or more whitespace characters (\\s+), then the second group captures one or more characters that are not a newline (([^\n]+)). Next come one or more characters that are not letters ([^A-Za-z]+), the third group again captures one or more non-newline characters, another run of non-letters follows, and the last group captures the rest of the string ((.*)).

Then loop across the columns 'BetMGM' to 'FanDuel'. From each value, extract an optional minus sign followed by one or more characters that are neither - nor u, up to but not including a whitespace character (the lookahead (?=\\s)). Replace a number followed by ½ with (number + 0.5), as ½ was the only fraction present, and finally loop over the resulting strings and evaluate each one as an R expression to get a numeric value.

library(dplyr)
library(tidyr)
library(purrr)
library(stringr)  # needed for str_extract() and str_replace()
spread_table1 <- spread_table %>%
   dplyr::select(game_info, BetMGM, Caesers, FanDuel) %>% 
   # split game_info into date, time, away team and home team
   tidyr::extract(game_info, into = c("date", "time", "away_team_name", 
    "home_team_name"), "^(\\S+)\\s+([^\n]+)[^A-Za-z]+([^\n]+)[^A-Za-z]+(.*)")  %>% 
   # pull out the spread, rewrite "N½" as "(N + 0.5)", then evaluate to a number
   dplyr::mutate(across(BetMGM:FanDuel, ~
    purrr::map_dbl(stringr::str_replace(stringr::str_extract(., "-?[^-u]+(?=\\s)"), 
           "(\\d+)½", "(\\1 + 0.5)"), ~ eval(parse(text = .x)))))
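To see what the last step does on its own, here is a minimal sketch on made-up cell strings. The layout "spread, space, price" and the sample values are assumptions for illustration and may not match what the page actually returns. The names cells, spread_txt, exprs, via_eval and via_numeric are hypothetical.

```r
library(stringr)

# Hypothetical raw cells (assumption): a spread followed by a price.
cells <- c("-11\u00bd -110", "+4 -105", "-2\u00bd -115")

# Step 1: optional minus sign, then characters other than "-" or "u",
# stopping just before a whitespace character.
spread_txt <- str_extract(cells, "-?[^-u]+(?=\\s)")

# Step 2: rewrite "N½" as the arithmetic expression "(N + 0.5)".
exprs <- str_replace(spread_txt, "(\\d+)\u00bd", "(\\1 + 0.5)")

# Step 3: evaluate each string as R code to obtain a number.
via_eval <- vapply(exprs, function(x) eval(parse(text = x)),
                   numeric(1), USE.NAMES = FALSE)
print(via_eval)

# A different technique, not the one used in the answer: turn "½" into ".5"
# and convert with as.numeric(), which avoids evaluating scraped text as code.
via_numeric <- as.numeric(str_replace(spread_txt, "\u00bd", ".5"))
print(via_numeric)
```

Both approaches should agree on inputs like these. The as.numeric variant may be preferable when the text comes from a web page, since eval(parse()) runs whatever expression the string contains.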

Upvotes: 2
