DSH

Reputation: 427

R: calling dplyr 1.0.0 filter() and across() within purrr::map()

Here is code that reads data from a remote URL and then finds rows that are NA in every column within each of five different subsets of columns. The output table miss_recode gives the ID for those rows along with the variable recode_cols, which labels the subset of columns that is all NA.

suppressMessages(library(tidyverse))

urlRemote_path  <- "https://raw.githubusercontent.com/"
github_path <- "DSHerzberg/TOD-R/master/INPUT-FILES/"
fileName_path   <- "SO-data.csv"

input <- suppressMessages(read_csv(url(
  str_c(urlRemote_path, github_path, fileName_path)
)))

miss1 <- input %>%
  filter(across(c(i001:i035),
                ~ is.na(.))) %>% 
  mutate(recode_cols = "i001:i035")
miss2 <- input %>%
  filter(across(c(i036:i060),
                ~ is.na(.))) %>% 
  mutate(recode_cols = "i036:i060")
miss3 <- input %>%
  filter(across(c(i061:i100),
                ~ is.na(.))) %>% 
  mutate(recode_cols = "i061:i100")
miss4 <- input %>%
  filter(across(c(i101:i130),
                ~ is.na(.))) %>% 
  mutate(recode_cols = "i101:i130")
miss5 <- input %>%
  filter(across(c(i131:i165),
                ~ is.na(.))) %>% 
  mutate(recode_cols = "i131:i165")

miss_recode <- bind_rows(
  miss1, 
  miss2, 
  miss3, 
  miss4, 
  miss5
) %>% 
  select(ID, recode_cols)

I want to consolidate the code with purrr::map. The next snippet shows my attempt, but it returns Error: Can't subset columns that don't exist.

vec <- c("i001:i035", "i036:i060", "i061:i100", "i101:i130", "i131:i165")

miss_recode_map <- vec %>%
  map_df(~ input %>%
    filter(across(c(!!sym(.x)),
                  ~ is.na(.))) %>%
    mutate(recode_cols = .x) %>%
    select(ID, recode_cols))

Clearly I'm not getting the NSE right. This seems like a new question related to across(), which is now available in dplyr 1.0.0. In this instance, it seems that one use of .x (inside across()) requires the elements of vec to be unquoted, while the other (inside mutate()) requires them to be quoted.

Thanks in advance for any help.

Upvotes: 2

Views: 491

Answers (1)

Ronak Shah

Reputation: 388817

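You cannot convert "i001:i035" to a symbol, because sym() treats the whole string as the name of a single column, and no such column exists. The string is an expression (a column range), so you need to parse it instead.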

library(dplyr)
library(rlang)

purrr::map_df(vec, ~input %>%
                    filter(across(!!parse_expr(.x),~ is.na(.))) %>%
                    mutate(recode_cols = .x) %>%
                    select(ID, recode_cols))

# A tibble: 8 x 2
#      ID recode_cols
#   <dbl> <chr>      
#1 201010 i036:i060  
#2 214063 i036:i060  
#3 262050 i036:i060  
#4 262063 i036:i060  
#5 205036 i061:i100  
#6 231007 i061:i100  
#7 208014 i101:i130  
#8 231014 i131:i165  
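
To see the difference between sym() and parse_expr() without downloading the data, here is a minimal sketch on a made-up tibble. The names toy, toy_vec and toy_result and the columns a1, a2, b1, b2 are hypothetical and are not part of the original data. The sketch also assumes a dplyr version that provides if_all() (added after 1.0.0 and recommended over across() inside filter() in later releases); on dplyr 1.0.0 itself, keep across() as shown above.

```r
library(dplyr)
library(purrr)
library(rlang)

# Hypothetical toy data standing in for `input` (not the real SO-data.csv)
toy <- tibble(
  ID = 1:4,
  a1 = c(NA, 1, NA, 4),
  a2 = c(NA, 2, 3, 4),
  b1 = c(1, NA, 3, 4),
  b2 = c(1, NA, NA, 4)
)

# Column ranges stored as strings, as in `vec` from the question
toy_vec <- c("a1:a2", "b1:b2")

# sym() builds ONE symbol whose name is the entire string,
# i.e. it refers to a column literally called "a1:a2"
is_symbol(sym("a1:a2"))

# parse_expr() builds a call to `:` with a1 and a2 as its arguments,
# which tidyselect can interpret as a column range
is_call(parse_expr("a1:a2"), ":")

# Same pattern as the answer, with if_all() in place of across()
# (assumption: dplyr >= 1.0.4)
toy_result <- map_df(toy_vec, ~ toy %>%
  filter(if_all(!!parse_expr(.x), is.na)) %>%
  mutate(recode_cols = .x) %>%
  select(ID, recode_cols))

print(toy_result)
```

On this toy data only ID 1 is NA across a1:a2 and only ID 2 is NA across b1:b2, so toy_result should have those two rows. Rows where only some columns in the range are NA (ID 3 here) are dropped, which matches the "all NA" intent of the original code.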

Upvotes: 2
