Reputation: 1325
I have a data frame, one column of which contains more data frames. One of those data frames is missing a column. I would like to remove that column from the other two data frames, if it exists.
Here's an example:
library(tidyverse)
mtcars %>%
group_by(cyl) %>%
nest -> tmp
tmp[3,'data'][[1]][[1]] <- dplyr::select(tmp[3,'data'][[1]][[1]], -mpg)
print(tmp)
# A tibble: 3 x 2 cyl data <dbl> <list> 1 6. <tibble [7 × 10]> 2 4. <tibble [11 × 10]> 3 8. <tibble [14 × 9]>
So here, the data
column contains three tibbles, the last of which doesn't contain the column mpg
. I can map dplyr::select over the data column, and catch errors by returning NA
as follows:
tmp %>% mutate(data2 = map(data, possibly(~dplyr::select(.,-mpg), otherwise = NA)))
# A tibble: 3 x 3 cyl data data2 <dbl> <list> <list> 1 6. <tibble [7 × 10]> <tibble [7 × 9]> 2 4. <tibble [11 × 10]> <tibble [11 × 9]> 3 8. <tibble [14 × 9]> <lgl [1]>
But what I'd really rather do is return the input data. Something like:
tmp %>% mutate(data2 = map(data, possibly(~dplyr::select(.,-mpg), otherwise = function(x){x})))
# A tibble: 3 x 3 cyl data data2 <dbl> <list> <list> 1 6. <tibble [7 × 10]> <tibble [7 × 9]> 2 4. <tibble [11 × 10]> <tibble [11 × 9]> 3 8. <tibble [14 × 9]> <fn>
But of course this just returns function(x){x}
as the last row of data2
.
Any clever ideas? Or is this the kind of situation where I just need to use tryCatch
or otherwise handle errors more explicitly?
Upvotes: 3
Views: 327
Reputation: 215047
The otherwise
argument in possibly
is a constant, so it can't change with the input of the wrapped function; What you might do is wrap possibly
in another function that can access element from data
, set the element as otherwise
:
my_select <- function(x) {
f = possibly(function() select(x, -mpg), otherwise = x)
f()
}
tmp %>% mutate(data2 = map(data, my_select))
# A tibble: 3 x 3
# cyl data data2
# <dbl> <list> <list>
#1 6.00 <tibble [7 x 10]> <tibble [7 x 9]>
#2 4.00 <tibble [11 x 10]> <tibble [11 x 9]>
#3 8.00 <tibble [14 x 9]> <tibble [14 x 9]>
Or take the formula form:
tmp %>%
mutate(data2 = map(data, ~ (invoke(possibly(function() select(.,-mpg), otherwise = .)))))
# A tibble: 3 x 3
# cyl data data2
# <dbl> <list> <list>
#1 6 <tibble [7 × 10]> <tibble [7 × 9]>
#2 4 <tibble [11 × 10]> <tibble [11 × 9]>
#3 8 <tibble [14 × 9]> <tibble [14 × 9]>
Upvotes: 5