Reputation: 2301
I have a dataset df that looks like this:
and another two datasets that look like this:
Dataset A:
Dataset B:
I want to build a data C<-df %>% keep (name(df) %in% A$Data)
and then check if B$Data show up in C or not. what should I do?
Sample data can be get from:
df<-structure(list(cy3.CSV = c("cy3.CSV", NA, NA), cy6.CSV = c("cy1.CSV",
"cy24.CSV", "cy6.CSV"), dlt.CSV = c("dlt.CSV", NA, NA), dm.CSV = c("dm.CSV",
NA, NA), dov.CSV = c("dov.CSV", "dov_1.CSV", NA), dov_1.CSV = c("dov_1.CSV",
NA, NA), ds.CSV = c("ds.CSV", "ds_1.CSV", NA)), row.names = c(NA,
-3L), class = c("tbl_df", "tbl", "data.frame"))
A<-structure(list(Data = c("cy6.CSV", "dov.CSV", "ds.CSV"), match = c(3,
2, 2)), row.names = c(NA, -3L), class = c("tbl_df", "tbl", "data.frame"
))
B<-structure(list(Data = c("cy3.CSV", "dlt.CSV", "dm.CSV", "dov_1.CSV"
), match = c(1, 1, 1, 1)), row.names = c(NA, -4L), class = c("tbl_df",
"tbl", "data.frame"))
Thanks.
Upvotes: 0
Views: 60
Reputation: 887088
We can place the dataset in a list
and then do it in a loop
library(dplyr)
library(purrr)
map(list(A, B), ~ .x %>%
mutate(in_df = Data %in% union(colnames(df), unlist(df))))
-output
#[[1]]
# A tibble: 3 x 3
# Data match in_df
# <chr> <dbl> <lgl>
#1 cy6.CSV 3 TRUE
#2 dov.CSV 2 TRUE
#3 ds.CSV 2 TRUE
#[[2]]
# A tibble: 4 x 3
# Data match in_df
# <chr> <dbl> <lgl>
#1 cy3.CSV 1 TRUE
#2 dlt.CSV 1 TRUE
#3 dm.CSV 1 TRUE
#4 dov_1.CSV 1 TRUE
Based on the comments, may be this helps
C <- df %>%
select_if(~ any(. %in% A$Data))
# dplyr version >= 1.0
C %>%
mutate(across(everything(), ~ replace(., is.na(.), intersect(B$Data, .)[1])))
# A tibble: 3 x 3
# cy6.CSV dov.CSV ds.CSV
# <chr> <chr> <chr>
#1 cy1.CSV dov.CSV ds.CSV
#2 cy24.CSV dov_1.CSV ds_1.CSV
#3 cy6.CSV dov_1.CSV <NA>
# dplyr version < 1.0
C %>%
mutate_all( ~ replace(., is.na(.), intersect(B$Data, .)[1]))
Upvotes: 3
Reputation: 17715
Convert your df
to a vector, making sure to preserve column names. Then, check for the condition using %in%
:
m <- c(colnames(df), unlist(df))
B$in_df <- B$Data %in% m
A$in_df <- A$Data %in% m
A
#> Data match in_df
#> 1 cy6.CSV 3 TRUE
#> 2 dov.CSV 2 TRUE
#> 3 ds.CSV 2 TRUE
B
#> Data match in_df
#> 1 cy3.CSV 1 TRUE
#> 2 dlt.CSV 1 TRUE
#> 3 dm.CSV 1 TRUE
#> 4 dov_1.CSV 1 TRUE
Upvotes: 3