Reputation: 3022
I have an Oracle database which is refreshed once a day. I am a bit confused about how Shiny apps work: what gets run once on app startup, and what gets run once per session.
My naive approach was to create a database connection and run a query outside of the UI and server code to create a data frame of around 600,000 records, which can then be filtered and sliced during the session. I am a bit concerned that, by doing it inside app.R in global scope, this connection and data frame will only be created once when the server starts the app, and will never get run again (if that makes sense).
If I create the data frame in server, then my UI code fails, as it is dependent on the results of a query to populate the select list, and I do this in app.R scope at the moment, so the UI can access it.
library(shiny)
library(DBI)
library(dplyr)
library(odbc)
library(stringdist)
library(reactable)
############################################################################
# business functions #
############################################################################
get_list_of_actives_from_db <- function() {
  con <- dbConnect(odbc::odbc(), Driver = "oracle", Host = "server.mycompany.net", Port = "1521", SVC = "service1", UID = "user_01", PWD = "hello", timeout = 10)
  on.exit(dbDisconnect(con), add = TRUE) # close the connection when the function exits
  ingredients_df <- dbGetQuery(con, "select DISTINCT INGREDIENTS FROM AES")
  return(ingredients_df)
}
get_adverse_events_from_db <- function() {
  con <- dbConnect(odbc::odbc(), Driver = "oracle", Host = "server.mycompany.net", Port = "1521", SVC = "service1", UID = "user_01", PWD = "hello", timeout = 10)
  on.exit(dbDisconnect(con), add = TRUE) # close the connection when the function exits
  cases_df <- dbGetQuery(con, "select * FROM AES")
  return(cases_df)
}
############################################################################
# load data sets for use in dashboard #
############################################################################
cases_df = get_adverse_events_from_db() # main data to slice and filter
ingredients_df = get_list_of_actives_from_db() # drive select list in UI
############################################################################
# shiny UI #
############################################################################
ui <- fluidPage(
  "Adverse Event Fuzzy Search Tool",
  fluidRow(
    # the query above returns a single INGREDIENTS column
    selectInput("ingredients", label = "Select one or more Active Ingredients:", choices = ingredients_df$INGREDIENTS, multiple = TRUE),
    textInput("search_term", "AE Search Term:"),
    actionButton("do_search", "Perform Search")
  ),
  fluidRow(
    reactableOutput("search_results")
  )
)
############################################################################
# shiny server #
############################################################################
server <- function(input, output, session) {
# do stuff here to filter the data frame based on the selected value and render a table
}
# Run the application
shinyApp(ui = ui, server = server)
My main concern is doing this in the root of app.R. Both functions run Oracle queries that never need to be re-run during a session, as the data only changes overnight via ETL.
############################################################################
# load data sets for use in dashboard #
############################################################################
cases_df = get_adverse_events_from_db()
ingredients_df = get_list_of_actives_from_db()
When and how often is this called? Once when the app is initialized, so the data set is never updated and is shared across sessions by users? Or is the entire script run end to end whenever a new session is started?
Part of me thinks it should be in the server function, so it runs once per session. But being new to Shiny, I feel like server is called constantly whenever there is a change in the UI, and I don't want to be constantly loading 600,000 records from Oracle.
Ideally I would cache the results once a day and make them available to all users across all sessions, but I am not sure how to achieve that. For now I just want to know the best way to run the query once per user and keep the data frame cached for the session.
Upvotes: 2
Views: 472
Reputation: 33417
Please check RStudio's article Scoping rules for Shiny apps in this context.
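In short: code outside the server function runs once per R process (when the app is started) and is shared by all sessions, while the body of the server function runs once per session, not on every input change. A minimal sketch to see this for yourself; the object names and the output id are made up for illustration:

```r
library(shiny)

# Global scope: evaluated once per R process, when the app starts.
# Every session served by this process sees the same value.
app_started_at <- Sys.time()

ui <- fluidPage(verbatimTextOutput("scope_info"))

server <- function(input, output, session) {
  # Server scope: evaluated once for each new session (browser connection),
  # not every time an input changes.
  session_started_at <- Sys.time()

  output$scope_info <- renderText({
    paste0(
      "App process started: ", format(app_started_at), "\n",
      "This session started: ", format(session_started_at)
    )
  })
}

app <- shinyApp(ui, server)

# Only launch when run interactively, e.g. from the R console.
if (interactive()) runApp(app)
```

Open the app in two browser tabs: the first timestamp should be identical in both, the second one should differ.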
If I got you right, you are asking to share a dataset across shiny-sessions and update it daily (The title of the question didn't really fit your explanation of the problem - I edited it).
I'd suggest using a cross-session reactivePoll to avoid unnecessary DB queries (I once asked a similar question here. Over there I gave an example showing that the same can be achieved via reactiveValues, but it's more complex).
Here is the simple pattern you can use. Please note that reactivePoll is defined outside the server function, so all sessions share the same data:
library(shiny)
ui <- fluidPage(textOutput("my_db_data"))
updated_db_data <- reactivePoll(
  intervalMillis = 1000L * 60L * 5L, # check for a new day every 5 minutes
  session = NULL, # NULL: not tied to a single session
  checkFunc = function() {
    print(paste("Running checkFunc:", Sys.time()))
    Sys.Date()
  },
  valueFunc = function() {
    # your db query goes here:
    paste("Latest DB update:", Sys.time())
  }
)
server <- function(input, output, session) {
  output$my_db_data <- renderText(updated_db_data())
}
shinyApp(ui, server)
Here, every 5 minutes the checkFunc checks for a new day; valueFunc is executed only if the result of checkFunc changed. As a (real world) alternative for checkFunc you could implement a query to check for the number of rows of a certain DB table.
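A rough sketch of that row-count variant. To keep it runnable without your Oracle server, an in-memory SQLite table stands in for AES here (that stand-in, the RSQLite dependency and the helper names count_aes_rows and load_aes are my assumptions, not part of your setup); in your app con would be your odbc connection:

```r
library(shiny)
library(DBI)

# ASSUMPTION: in-memory SQLite stand-in for the Oracle AES table.
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "AES", data.frame(INGREDIENTS = c("A", "B", "A")))

# Cheap query, executed on every poll (hypothetical helper).
count_aes_rows <- function(con) {
  dbGetQuery(con, "SELECT COUNT(*) AS N FROM AES")$N
}

# Expensive query, executed only when the row count changed (hypothetical helper).
load_aes <- function(con) {
  dbGetQuery(con, "SELECT * FROM AES")
}

# Defined outside the server function, so all sessions share the result.
aes_data <- reactivePoll(
  intervalMillis = 1000L * 60L * 5L, # poll every 5 minutes
  session = NULL,
  checkFunc = function() count_aes_rows(con),
  valueFunc = function() load_aes(con)
)
```

Note that a row count only works as a change signal if the nightly ETL actually changes the number of rows; if it can replace rows one-for-one, a load timestamp (if your table has one) would be a safer thing to poll.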
PS: There is an example given on a cross-session reactiveFileReader (which is based on reactivePoll) when viewing ?reactiveFileReader
PPS: When doing further filtering etc. on that dataset also check bindCache().
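A minimal sketch of what that could look like (requires shiny >= 1.6.0). The toy data frame, its EVENT column and the output id are placeholders I made up; in your app the filtering would run on the shared dataset:

```r
library(shiny)

# ASSUMPTION: toy data standing in for the shared 600,000-row dataset.
cases_df <- data.frame(
  INGREDIENTS = c("A", "B", "A"),
  EVENT = c("x", "y", "z")
)

ui <- fluidPage(
  selectInput("ingredients", "Ingredients:",
              choices = unique(cases_df$INGREDIENTS), multiple = TRUE),
  tableOutput("filtered")
)

server <- function(input, output, session) {
  # The result is cached per combination of key values. Sys.Date() is part
  # of the key so that results cached yesterday are not served after the
  # nightly refresh.
  filtered <- bindCache(
    reactive({
      req(input$ingredients)
      cases_df[cases_df$INGREDIENTS %in% input$ingredients, , drop = FALSE]
    }),
    input$ingredients, Sys.Date()
  )

  output$filtered <- renderTable(filtered())
}

app <- shinyApp(ui, server)

# Only launch when run interactively.
if (interactive()) runApp(app)
```

By default bindCache() uses an application-level cache, so a filter result computed for one user can be reused by other sessions that select the same ingredients.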
Upvotes: 2
Reputation: 160437
While untested, perhaps this architecture will work:
server <- function(input, output, session) {
  # `con` is assumed to be an open DBI connection
  dailydata_ <- reactiveValues(when = NULL, what = NULL)
  dailydata <- reactive({
    oldwhen <- dailydata_$when
    if (is.null(oldwhen) ||
        as.Date(oldwhen) < Sys.Date()) {
      newdata <- tryCatch(
        DBI::dbGetQuery(con, "..."),
        error = function(e) e)
      if (inherits(newdata, "error")) {
        # on failure, `newdata` holds the condition object
        warning("error retrieving new data: ", conditionMessage(newdata))
        warning("using stale data instead")
      } else {
        dailydata_$when <- Sys.time()
        dailydata_$what <- newdata
      }
    }
    dailydata_$what
  })
  # some consumer of the real data
  output$tbl <- renderTable(dailydata())
}
The advantage of this is that a re-query is triggered when the cached data was retrieved on an earlier day. Granted, the time at which the new ETL load becomes available may change how exactly this conditional should be written: if the data is updated at (say) 2am, you may need some more time-math to determine whether the current data is from before or after the most recent update.
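Equally untested against a real ETL schedule, that time-math could look roughly like this. The helper names last_refresh_time() and is_stale(), the 2am refresh hour and the UTC time zone are all assumptions for illustration:

```r
# Most recent scheduled refresh at or before `now`.
# ASSUMPTION: the ETL finishes at `refresh_hour` o'clock in time zone `tz`.
last_refresh_time <- function(now = Sys.time(), refresh_hour = 2, tz = "UTC") {
  midnight <- as.POSIXct(format(now, "%Y-%m-%d", tz = tz), tz = tz)
  todays_refresh <- midnight + refresh_hour * 3600
  if (now >= todays_refresh) {
    todays_refresh
  } else {
    todays_refresh - 24 * 3600 # today's refresh has not happened yet
  }
}

# TRUE if nothing was retrieved yet, or if the data was retrieved
# before the most recent scheduled refresh.
is_stale <- function(retrieved_at, now = Sys.time(), refresh_hour = 2, tz = "UTC") {
  is.null(retrieved_at) ||
    retrieved_at < last_refresh_time(now, refresh_hour, tz)
}
```

In the reactive above, the condition is.null(oldwhen) || as.Date(oldwhen) < Sys.Date() would then become is_stale(oldwhen). In a time zone with daylight saving, the fixed 3600-second arithmetic can be off by an hour on the changeover days.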
This logic has a "data available" fail: if it could not be queried, then the current/stale data is re-used. If you prefer that it returns no data, that is easy enough to change in the code.
(One thing you might want to do is to show the user when the data was last retrieved; this can be retrieved directly with dailydata_$when, accepting that it might be NULL.)
Upvotes: 0