Wasabi
Wasabi

Reputation: 3071

Package wrapping "odbc" doesn't work with a global connection

I have a package pkg which wraps around the odbc package to simplify my life. The package is composed of a single code file sql_con.R:

# sql_con.R
getQuery <- function(sql) {
  con <- odbc::dbConnect(odbc::odbc(),
                   Driver = "SQL Server",
                   Server = "Foo",
                   UID = "user",
                   PWD = "password")
  return(odbc::dbGetQuery(con, sql))
}

# DESCRIPTION
Package: NCHUtils
Title: What the Package Does (One Line, Title Case)
Version: 0.0.0.9000
Authors@R: 
    person(given = "First",
           family = "Last",
           role = c("aut", "cre"),
           email = "[email protected]",
           comment = c(ORCID = "YOUR-ORCID-ID"))
Description: What the package does (one paragraph).
License: What license it uses
Encoding: UTF-8
LazyData: true
Depends: 
    odbc

# NAMESPACE
exportPattern("^[^\\.]")
importMethodsFrom(odbc, dbGetQuery, dbConnect)

Building this package works correctly, library(pkg) followed by some SQL call using getQuery() gives good results.

However, having to create a new connection with each query is a bit silly, so I want to make con a global variable which is only created once and reused every time. (This will be later improved upon using the pool package). I've also put the connection in an environment to get over any binding locking issues (following the suggestion in this R-bloggers post).

# sql_con.R version 2.0
pkg <- new.env()

pkg$con <- odbc::dbConnect(odbc::odbc(),
                   Driver = "SQL Server",
                   Server = "Foo",
                   UID = "user",
                   PWD = "password")

getQuery <- function(sql) {
  return(odbc::dbGetQuery(pkg$con, sql))
}

Building the package is once again successful.

I then run the following tests:

library(pkg)

pkg::pkg
# <environment: 0x000001a9dbcf8f88>

pkg::pkg$con
# An object of class "Microsoft SQL Server"
# [a bunch of attributes...]

pkg::getQuery("SELECT * FROM Foo")
# Error in (function (classes, fdef, mtable)  : 
#   unable to find an inherited method for function ‘dbGetQuery’ for signature ‘"Microsoft SQL Server", "character"’

Removing the use of an environment changes nothing. I've already set DESCRIPTION to use Depends: instead of the common Imports: in an attempt to fix this (the first version worked with Imports:), and am (afaik) importing the function correctly with importMethodsFrom in NAMESPACE.

I've noticed odbc creates the inherited function with the "Microsoft SQL Server" signature only once dbGetQuery() is actually called with such an argument. But I can't see how that'd be relevant.

Is there any reason why making the connection global breaks the package?

Upvotes: 4

Views: 1539

Answers (1)

Alexis
Alexis

Reputation: 5059

I don't have an ODBC connection right now, but I think I know what the problem is.

I came across this thread, which I suggest you read through because you might need something similar to clean your connections, but the key insight is:

Note that when you install a package, R runs all the code in the package and only stores the results of the code in the installed package. So if you create an object outside of a function in your package, then only the object will be stored in the package, but not the code that creates it. The object will be simply loaded when you load the package, but it will not be re-created.

That means that your package is trying to use a DB connection that was serialized and subsequently deserialized, which can't possibly work. If you wanted to manage it entirely by yourself, you'd probably need something like:

conn_provider <- with(new.env(), {
    conn <- NULL

    function() {
        if (is.null(conn)) {
            conn <<- DBI::dbConnect(your_details)
        }

        conn
    }
})

And a reg.finalizer call like explained in the linked thread.

Regarding package imports, from what I can gather, the generics are defined by DBI, and different packages build upon it, so I think you're better off importing the generics from DBI and the methods from odbc. If you use roxygen2 (which you definitely should), you'd need something along the lines of:

#' @importFrom DBI dbGetQuery
#' @importMethodsFrom odbc dbGetQuery
#'
getQuery <- function(sql) {
  DBI::dbGetQuery(conn_provider(), sql)
}

And similarly for all your other functions. I think that this way, the calls to DBI generics in your package can always dispatch to the specific methods defined by any package that builds upon DBI, not just odbc ones.

Upvotes: 4

Related Questions