Reputation: 371
I've been struggling to add a new column only if it does not already exist. I found an answer here: Adding column if it does not exist.
However, in my case I must use it inside a purrr workflow. I tried to adapt the above answer, but it doesn't fit my needs.
Here is an example of what I'm dealing with. Suppose I have a list of two data.frames:
library(tibble)
library(dplyr)  # attaches the %>% pipe used below
A = tibble(
  x = 1:5, y = 1, z = 2
)
B = tibble(
  x = 5:1, y = 3, z = 3, w = 7
)
dt_list = list(A, B)
The column I'd like to add is w:
cols = c(w = NA_real_)
For a single data frame, I can add the column only if it does not exist, as follows.
Since w already exists in B, no column is added:
B %>% tibble::add_column(!!!cols[!names(cols) %in% names(.)])
# A tibble: 5 x 4
      x     y     z     w
  <int> <dbl> <dbl> <dbl>
1     5     3     3     7
2     4     3     3     7
3     3     3     3     7
4     2     3     3     7
5     1     3     3     7
In this case, since w does not exist in A, it is added:
A %>% tibble::add_column(!!!cols[!names(cols) %in% names(.)])
# A tibble: 5 x 4
      x     y     z     w
  <int> <dbl> <dbl> <dbl>
1     1     1     2    NA
2     2     1     2    NA
3     3     1     2    NA
4     4     1     2    NA
5     5     1     2    NA
I tried the following to replicate it using purrr (I'd prefer not to use a for loop):
dt_list_2 = dt_list %>%
  purrr::map(
    ~dplyr::select(., -starts_with("x")) %>%
      ~tibble::add_column(!!!cols[!names(cols) %in% names(.)])
  )
But the output is not the same as doing it separately.
Note: This is a simplified example of my real problem. In fact, I'm using purrr to read many *.csv files and then apply some data transformations. Something like this:
re_file <- list.files(path = dir_path, pattern = "*.csv")
cols_add = c(UCI = NA_real_)
file_list = re_file %>%
  purrr::map(function(file_name){ # iterate through each file name
    read_csv(file = paste0(dir_path, "//",file_name), skip = 2)
  }) %>%
  purrr::map(
    ~dplyr::select(., -starts_with("Textbox")) %>%
      ~dplyr::tibble(!!!cols[!names(cols) %in% names(.)])
  )
Upvotes: 2
Views: 642
Reputation: 389145
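You can pass the data frame explicitly (as .) to add_column() inside a single lambda. The extra ~ in the middle of your pipe turns that step into a formula rather than a function call:
dt_list %>%
  purrr::map(
    ~tibble::add_column(., !!!cols[!names(cols) %in% names(.)])
  )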
#[[1]]
# A tibble: 5 x 4
#      x     y     z     w
#  <int> <dbl> <dbl> <dbl>
#1     1     1     2    NA
#2     2     1     2    NA
#3     3     1     2    NA
#4     4     1     2    NA
#5     5     1     2    NA
#[[2]]
# A tibble: 5 x 4
#      x     y     z     w
#  <int> <dbl> <dbl> <dbl>
#1     5     3     3     7
#2     4     3     3     7
#3     3     3     3     7
#4     2     3     3     7
#5     1     3     3     7
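If you also need the select() step from your attempt, one way to sketch it is to wrap the column check in a small helper and call both steps inside a single lambda. Note that add_missing_cols is a hypothetical helper name of my own, not a tibble or purrr function, and the data are the A and B from the question:

```r
library(tibble)
library(dplyr)
library(purrr)

# Example data copied from the question
A <- tibble(x = 1:5, y = 1, z = 2)
B <- tibble(x = 5:1, y = 3, z = 3, w = 7)
dt_list <- list(A, B)
cols <- c(w = NA_real_)

# Hypothetical helper (not part of any package):
# adds only those entries of `cols` whose names are not already columns of `df`
add_missing_cols <- function(df, cols) {
  to_add <- cols[!names(cols) %in% names(df)]
  tibble::add_column(df, !!!to_add)
}

# One lambda per data frame: drop the "x" columns, then add what is missing
dt_list_2 <- dt_list %>%
  map(~ .x %>%
        select(-starts_with("x")) %>%
        add_missing_cols(cols))
```

With this input, both elements of dt_list_2 should end up with the columns y, z and w: w is NA for the first data frame and keeps its existing value of 7 for the second. Passing the data frame by name into the helper also avoids relying on what . refers to inside nested pipes.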
Upvotes: 2