robertspierre

Reputation: 4431

Convert a specific warning type into an error

Consider the following code:

> df <- tibble(gender=c(1,1,0))
> df$male
Warning: Unknown or uninitialised column: `male`.
NULL

How can I convert this specific warning type into an error?

I would like something like options(warn = 2), but only for this specific type of warning (i.e. referring to a column that doesn't exist in a tibble).

Upvotes: 2

Views: 135

Answers (5)

Roland

Reputation: 132969

Based on the idea in @BenBolker's answer, you can create a global handler:

globalCallingHandlers(warning = function(w) {
  if(grepl("Unknown or uninitialised column", w$message, fixed = TRUE)) 
    stop(errorCondition(w$message, class = "unknown_column_extract_error")) else w
  })


library(tibble)
df <- tibble(gender=c(1,1,0))
df$male
#Error: Unknown or uninitialised column: `male`.

#doesn't affect other warnings
as.numeric("a")
#[1] NA
#Warning message:
#  NAs introduced by coercion 

Of course, this would be simpler if the warning was classed.

Upvotes: 3

user2554330

Reputation: 44957

Others have given you workarounds for this issue. In this answer I'll explain what the tibble authors could have done to make the job easier.

R has support for classed errors and warnings, but they are not used much. The rlang package has the class argument to each of abort(), warn() and inform(). You don't need rlang for this, since it can all be done with base functions, but rlang packages it nicely.

So what tibble could have done is set a separate class on every warning they issue. Then you could catch the particular class corresponding to the warning that interests you, and convert it into an error. Or you could make your own $.tbl_df method that does this, and then choose whether or not you want the conversion.

For example, instead of

warn(paste0("Unknown or uninitialised column: ", tick(name), 
        "."))

they could have written

warn(paste0("Unknown or uninitialised column: ", tick(name), 
        "."), class = "warn.$.tbl_df")

and then you could run

tryCatch(df$male, `warn.$.tbl_df` = function(w) abort(w$message, class = "error.$.tbl_df", call = NULL))

to catch that warning, and convert it to an error. I converted to a classed error with class "error.$.tbl_df" just so some other code could catch that, but there's no need for that. I also set call = NULL so you don't get told about a line in the code of $.tbl_df that would mean nothing to you. You might be able to find the df$male expression if you play around with the setting; I'm not sure how far back it would be.
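The base-R route mentioned above can be sketched as follows. This is only an illustration: get_col() is a hypothetical stand-in for tibble's $ method, and the class name "unknown_column_warning" is made up here; tibble does not currently signal such a class. It assumes R >= 3.6.0 for warningCondition().

```r
# Hypothetical stand-in for tibble's `$` method that signals a classed warning
get_col <- function(x, name) {
  out <- x[[name]]
  if (is.null(out)) {
    warning(warningCondition(
      paste0("Unknown or uninitialised column: `", name, "`."),
      class = "unknown_column_warning"  # hypothetical class name
    ))
  }
  out
}

# Promote only the classed warning to an error; other warnings are untouched
strict <- function(expr) {
  withCallingHandlers(
    expr,
    unknown_column_warning = function(w) stop(conditionMessage(w), call. = FALSE)
  )
}

df <- list(gender = c(1, 1, 0))
ok  <- strict(get_col(df, "gender"))
res <- tryCatch(strict(get_col(df, "male")), error = function(e) "error")
```

Because the handler is keyed on the class rather than on the message text, it keeps working if the message is reworded or translated.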

Upvotes: 2

random_walk

Reputation: 93

The original function for the S3 method $.tbl_df is defined as:

function (x, name) 
{
    out <- .subset2(x, name)
    if (is.null(out)) {
        warn(paste0("Unknown or uninitialised column: ", tick(name), 
            "."))
    }
    out
}

You can simply replace the warning with an error using stop() or rlang::abort():

`$.tbl_df` <- function (x, name) {
  out <- .subset2(x, name)
  if (is.null(out)) {
    rlang::abort(paste0("Unknown or uninitialised column: ", name, "."))
  }
  out
}

Then, as expected:

> df$male
Error in `$.tbl_df`(df, male) : Unknown or uninitialised column: male.
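If the override should be switchable, one possibility is to gate it on an R option. The sketch below uses only base functions; the option name strict.tibble.columns is hypothetical, and the object is built by hand with the tbl_df class so the example runs without tibble loaded.

```r
# Override that errors only when a (hypothetical) option is set, else warns
`$.tbl_df` <- function(x, name) {
  out <- .subset2(x, name)
  if (is.null(out)) {
    msg <- paste0("Unknown or uninitialised column: `", name, "`.")
    if (isTRUE(getOption("strict.tibble.columns"))) {
      stop(msg, call. = FALSE)
    } else {
      warning(msg, call. = FALSE)
    }
  }
  out
}

# Hand-built object carrying the tbl_df class (stand-in for a real tibble)
df <- structure(list(gender = c(1, 1, 0)),
                class = c("tbl_df", "tbl", "data.frame"),
                row.names = 1:3)

options(strict.tibble.columns = TRUE)
strict_res <- tryCatch(df$male, error = function(e) "error")

options(strict.tibble.columns = FALSE)
lax_res <- tryCatch(df$male, warning = function(w) "warning")
```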

Addendum

As pointed out in the comments, this solution may cause problems if $.tbl_df is being called internally. Following @Billy34's lead, we can replace the internal $.tbl_df function with our modified function like so:

patch_tibble <- function(){
  
  # The modified "$.tbl_df" function
  f <- function (x, name) {
    out <- .subset2(x, name)
    if (is.null(out)) {
      abort(paste0("Unknown or uninitialised column: ", tick(name), 
                   "."))
    }
    out
  }
  
  # Set the function's namespace to tibble
  environment(f) <- asNamespace("tibble")
  
  # Replace the original function with our modified version
  utils::assignInNamespace("$.tbl_df", f, "tibble")
  
  print("Patched tibble!")
}

This yields the output:

> patch_tibble()
[1] "Patched tibble!"
> df$male
Error in `df$male`:
! Unknown or uninitialised column: `male`.

Upvotes: 3

Billy34

Reputation: 2214

It's a bit hacky, but you can patch tibble's $ method, i.e. the column getter. The warning is thrown by rlang::warn when the column name is not found in df. Substituting that call with one to rlang::abort turns the warning into an error.

patch_tibble <- function() {
  if(is.null(attr(tibble:::`$.tbl_df`, "patched"))) {
    tt <- get("$.tbl_df", envir=asNamespace("tibble"), inherits=FALSE)
    
    body(tt) <- methods::substituteDirect(body(tt), list(warn=quote(abort)))
    attr(tt, "patched") <- TRUE
    
    unlockBinding("$.tbl_df", asNamespace("tibble"))
    assign("$.tbl_df", tt, envir=asNamespace("tibble"), inherits = FALSE)
    lockBinding("$.tbl_df", asNamespace("tibble"))
    packageStartupMessage("Patched tibble!")
  }
}
library(tibble)
patch_tibble()

df <- tibble(gender=c(1,1,0))
df$male

Error in `df$male`:
! Unknown or uninitialised column: `male`.
Run `rlang::last_trace()` to see where the error occurred.

Upvotes: 4

Ben Bolker

Reputation: 226712

This might be the closest you'll get (without hacking as in the previous answer): use a tryCatch() that checks for the warning message (this will fail if someone uses different language settings where the warning has been translated).

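my_try <- function(expr, warnstr = "Unknown or uninitialised column") {
   tryCatch(expr,
      warning = function(w) {
         # Promote the matching warning to an error
         if (grepl(warnstr, conditionMessage(w), fixed = TRUE))
            stop(conditionMessage(w), call. = FALSE)
         # Re-signal any other warning (the value of expr is lost at this point)
         warning(w)
      })
}
my_try(df$male)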
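A variant of the same idea uses withCallingHandlers() instead of tryCatch(), so that unrelated warnings are left alone and the expression still returns its value. This is a sketch only: warning_to_error() and noisy() are hypothetical names, and noisy() merely imitates the tibble warning so the example runs without tibble. It still matches on the message text, so the translation caveat applies.

```r
warning_to_error <- function(expr, pattern = "Unknown or uninitialised column") {
  withCallingHandlers(
    expr,
    warning = function(w) {
      if (grepl(pattern, conditionMessage(w), fixed = TRUE)) {
        stop(conditionMessage(w), call. = FALSE)
      }
      # Returning normally lets any other warning continue as usual
    }
  )
}

# Hypothetical stand-in that imitates the tibble warning
noisy <- function() {
  warning("Unknown or uninitialised column: `male`.")
  NULL
}

res <- tryCatch(warning_to_error(noisy()), error = function(e) conditionMessage(e))
val <- suppressWarnings(warning_to_error(as.numeric("a")))
```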

Upvotes: 2
