ohnoplus
ohnoplus

Reputation: 1325

return the mapped object if expression inside of purrr:possibly() fails

I have a data frame, one column of which contains more data frames. One of those data frames is missing a column. I would like to remove that column from the other two data frames, if it exists.

Here's an example:

library(tidyverse)

mtcars %>%
group_by(cyl) %>%
nest -> tmp
tmp[3,'data'][[1]][[1]] <- dplyr::select(tmp[3,'data'][[1]][[1]], -mpg)

print(tmp)
# A tibble: 3 x 2
    cyl data              
  <dbl> <list>            
1    6. <tibble [7 × 10]> 
2    4. <tibble [11 × 10]>
3    8. <tibble [14 × 9]>

So here, the data column contains three tibbles, the last of which doesn't contain the column mpg. I can map dplyr::select over the data column, and catch errors by returning NA as follows:

 tmp %>% mutate(data2 = map(data, possibly(~dplyr::select(.,-mpg), otherwise = NA)))
# A tibble: 3 x 3
    cyl data               data2            
  <dbl> <list>             <list>           
1    6. <tibble [7 × 10]>  <tibble [7 × 9]> 
2    4. <tibble [11 × 10]> <tibble [11 × 9]>
3    8. <tibble [14 × 9]>  <lgl [1]>

But what I'd really rather do is return the input data. Something like:

 tmp %>% mutate(data2 = map(data, possibly(~dplyr::select(.,-mpg), otherwise = function(x){x})))
# A tibble: 3 x 3
    cyl data               data2            
  <dbl> <list>             <list>           
1    6. <tibble [7 × 10]>  <tibble [7 × 9]> 
2    4. <tibble [11 × 10]> <tibble [11 × 9]>
3    8. <tibble [14 × 9]>  <fn>

But of course this just returns function(x){x} as the last row of data2.

Any clever ideas? Or is this the kind of situation where I just need to use tryCatch or otherwise handle errors more explicitly?

Upvotes: 3

Views: 327

Answers (1)

akuiper
akuiper

Reputation: 215047

The otherwise argument in possibly is a constant, so it can't change with the input of the wrapped function; What you might do is wrap possibly in another function that can access element from data, set the element as otherwise:

my_select <- function(x) {
    f = possibly(function() select(x, -mpg), otherwise = x)
    f()
}

tmp %>% mutate(data2 = map(data, my_select))
# A tibble: 3 x 3
#    cyl data               data2            
#  <dbl> <list>             <list>           
#1  6.00 <tibble [7 x 10]>  <tibble [7 x 9]> 
#2  4.00 <tibble [11 x 10]> <tibble [11 x 9]>
#3  8.00 <tibble [14 x 9]>  <tibble [14 x 9]>

Or take the formula form:

tmp %>% 
    mutate(data2 = map(data, ~ (invoke(possibly(function() select(.,-mpg), otherwise = .)))))

# A tibble: 3 x 3
#    cyl data               data2            
#  <dbl> <list>             <list>           
#1     6 <tibble [7 × 10]>  <tibble [7 × 9]> 
#2     4 <tibble [11 × 10]> <tibble [11 × 9]>
#3     8 <tibble [14 × 9]>  <tibble [14 × 9]>

Upvotes: 5

Related Questions