Cristhian

Reputation: 371

Adding a column if it does not exist, inside a purrr pipeline

I've been struggling to add a new column only if it does not already exist. I found an answer here: Adding column if it does not exist.

However, I need to do this inside a purrr pipeline. I tried to adapt the answer above, but it doesn't work for my case.

Here is an example of what I'm dealing with:

Suppose I have a list of two data.frames:

library(tibble)

A = tibble(
  x = 1:5, y = 1, z = 2
)

B = tibble(
  x = 5:1, y = 3, z = 3, w = 7
)

dt_list = list(A, B)

The column I'd like to add is w:

cols = c(w = NA_real_)

Applied to each tibble separately, the following adds the column only if it is missing.

For B, w already exists, so no column is added:

B %>% tibble::add_column(!!!cols[!names(cols) %in% names(.)])

# A tibble: 5 x 4
      x     y     z     w
  <int> <dbl> <dbl> <dbl>
1     5     3     3     7
2     4     3     3     7
3     3     3     3     7
4     2     3     3     7
5     1     3     3     7

For A, w does not exist, so it is added:

A %>% tibble::add_column(!!!cols[!names(cols) %in% names(.)])

# A tibble: 5 x 4
      x     y     z     w
  <int> <dbl> <dbl> <dbl>
1     1     1     2    NA
2     2     1     2    NA
3     3     1     2    NA
4     4     1     2    NA
5     5     1     2    NA
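
To make the subsetting step explicit, here is a small sketch of what cols[!names(cols) %in% names(.)] evaluates to for each tibble. It uses base R only; present_in_A, present_in_B, to_add_A and to_add_B are names made up for illustration, and the column names are copied by hand from A and B above.

```r
cols <- c(w = NA_real_)

# Column names of A and B from the example above
present_in_A <- c("x", "y", "z")
present_in_B <- c("x", "y", "z", "w")

# Keep only the entries of cols whose names are not already present.
# %in% binds tighter than !, so this reads as !(names(cols) %in% ...)
to_add_A <- cols[!names(cols) %in% present_in_A]  # still holds w
to_add_B <- cols[!names(cols) %in% present_in_B]  # empty, nothing to add

print(to_add_A)
print(to_add_B)
```

So add_column() receives one new column for A and nothing at all for B.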

I tried the following to replicate it using purrr (I'd prefer not to use a for loop):

dt_list_2 = dt_list %>% 
  purrr::map(
    ~dplyr::select(., -starts_with("x")) %>% 
      ~tibble::add_column(!!!cols[!names(cols) %in% names(.)])
  )

But the output is not the same as doing it separately.

Note: This is a simplified version of my real problem, where I use purrr to read many *.csv files and then apply some data transformations. Something like this:

re_file <- list.files(path = dir_path, pattern = "*.csv")

cols_add = c(UCI = NA_real_)

file_list = re_file %>%
  purrr::map(function(file_name){ # iterate through each file name
    
    read_csv(file = paste0(dir_path, "//",file_name), skip = 2)
  }) %>% 
   purrr::map(
     ~dplyr::select(., -starts_with("Textbox")) %>% 
       ~dplyr::tibble(!!!cols[!names(cols) %in% names(.)])
  )

Upvotes: 2

Views: 642

Answers (1)

Ronak Shah

Reputation: 389145

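You can use a single lambda and pass . explicitly as the first argument of add_column (the second ~ in your attempt starts a new formula rather than calling add_column on the data):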

dt_list %>% 
  purrr::map(
    ~tibble::add_column(., !!!cols[!names(cols) %in% names(.)])
  )

#[[1]]
# A tibble: 5 x 4
#      x     y     z     w
#  <int> <dbl> <dbl> <dbl>
#1     1     1     2    NA
#2     2     1     2    NA
#3     3     1     2    NA
#4     4     1     2    NA
#5     5     1     2    NA

#[[2]]
# A tibble: 5 x 4
#      x     y     z     w
#  <int> <dbl> <dbl> <dbl>
#1     5     3     3     7
#2     4     3     3     7
#3     3     3     3     7
#4     2     3     3     7
#5     1     3     3     7
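
If you also need the select() step from your attempt, one option is to put both steps in a single function, so that names() is checked after the unwanted columns are dropped. This is only a minimal sketch using the toy data from the question; add_missing_cols is a hypothetical helper name, not something provided by tibble or purrr.

```r
library(tibble)
library(dplyr)
library(purrr)

# Toy data from the question
A <- tibble(x = 1:5, y = 1, z = 2)
B <- tibble(x = 5:1, y = 3, z = 3, w = 7)
dt_list <- list(A, B)
cols <- c(w = NA_real_)

# Hypothetical helper: add every entry of `cols` that `df` does not have yet
add_missing_cols <- function(df, cols) {
  to_add <- cols[!names(cols) %in% names(df)]
  add_column(df, !!!to_add)
}

dt_list_2 <- dt_list %>%
  map(function(df) {
    df %>%
      select(-starts_with("x")) %>%  # drop unwanted columns first
      add_missing_cols(cols)         # then fill in whatever is missing
  })

# Inspect the resulting column names of each tibble
map(dt_list_2, names)
```

The same helper could in principle be applied after read_csv() in your file pipeline, with cols_add in place of cols.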

Upvotes: 2
