Reputation: 999
Consider the following list of data frames:
library(tidyverse)

df1 <- tibble(
  id = 1:5,
  A = LETTERS[1:5],
  B = letters[10:14]
)
df2 <- tibble(
  id = 1:3,
  A = LETTERS[1:3],
  B = paste(LETTERS[1:3], letters[10:12])
)
df3 <- tibble(
  id = 1:6,
  B = paste(LETTERS[1:6], letters[10:15])
)
df4 <- tibble(
  id = 1:4,
  C = paste(LETTERS[15:18], letters[20:23])
)

df_ls <- list(df1, df2, df3, df4) %>%
  set_names(paste0("df", 1:4))
I would like to concatenate the elements of A and B into the B column if that's not already the case. Note that not all the data frames have a B column.
The conditions to do this are as follows:
1. The data frame must have both A and B columns
2. The first word of B must be different from that of A
I'm working with map functions. My attempt so far (without "condition 2"):
df_ls %>%
  map(
    ~ .x %>%
      mutate_at(
        vars(matches("B")),
        ~ {
          if (c("A", "B") %in% colnames(.) %>% sum() == 2)
            paste(A, B)
          else
            B
        }
      )
  )
It doesn't work.
Also, I can't work out how to write my second condition. I tried & setequal(. %>% pull(A), . %>% pull(B) %>% word(1)), without success.
Edit:
I need to keep all the data frames separately. Only the B column in df1 should be rewritten. df2, df3 and df4 should remain unchanged.
The expected output is:
$df1
# A tibble: 5 x 3
     id A     B
  <int> <chr> <chr>
1     1 A     A j
2     2 B     B k
3     3 C     C l
4     4 D     D m
5     5 E     E n

$df2
# A tibble: 3 x 3
     id A     B
  <int> <chr> <chr>
1     1 A     A j
2     2 B     B k
3     3 C     C l

$df3
# A tibble: 6 x 2
     id B
  <int> <chr>
1     1 A j
2     2 B k
3     3 C l
4     4 D m
5     5 E n
6     6 F o

$df4
# A tibble: 4 x 2
     id C
  <int> <chr>
1     1 O t
2     2 P u
3     3 Q v
4     4 R w
Upvotes: 3
Views: 1058
Reputation: 7724
You can first check whether both A and B are among the column names. If they are, check whether the first character of B (str_sub(B, 1, 1)) differs from A, and if it does, combine A and B.
With map_if, as suggested by @Moody_Mudskipper:
df_ls %>%
  map_if(~ all(c("A", "B") %in% colnames(.x)),
         ~ mutate(.x, B = if_else(str_sub(B, 1, 1) != A, paste(A, B), B)))
More verbose:
df_ls %>%
  map(~ {
    if (all(c("A", "B") %in% colnames(.x))) {
      .x %>%
        mutate(B = if_else(str_sub(B, 1, 1) != A, paste(A, B), B))
    } else {
      .x
    }
  })
# $df1
# # A tibble: 5 x 3
#      id A     B
#   <int> <chr> <chr>
# 1     1 A     A j
# 2     2 B     B k
# 3     3 C     C l
# 4     4 D     D m
# 5     5 E     E n
#
# $df2
# # A tibble: 3 x 3
#      id A     B
#   <int> <chr> <chr>
# 1     1 A     A j
# 2     2 B     B k
# 3     3 C     C l
#
# $df3
# # A tibble: 6 x 2
#      id B
#   <int> <chr>
# 1     1 A j
# 2     2 B k
# 3     3 C l
# 4     4 D m
# 5     5 E n
# 6     6 F o
#
# $df4
# # A tibble: 4 x 2
#      id C
#   <int> <chr>
# 1     1 O t
# 2     2 P u
# 3     3 Q v
# 4     4 R w
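The question phrases condition 2 in terms of the first word of B, whereas the code above compares only the first character. The two agree on this data, but they could differ if A held multi-character values. A minimal sketch of a first-word variant, assuming that reading of condition 2 is the intended one; `add_prefix` is a hypothetical helper name, and the data is rebuilt here only so the block runs on its own:

```r
library(dplyr)
library(purrr)
library(stringr)
library(tibble)

# Same data as in the question, rebuilt so this block is self-contained
df_ls <- list(
  df1 = tibble(id = 1:5, A = LETTERS[1:5], B = letters[10:14]),
  df2 = tibble(id = 1:3, A = LETTERS[1:3], B = paste(LETTERS[1:3], letters[10:12])),
  df3 = tibble(id = 1:6, B = paste(LETTERS[1:6], letters[10:15])),
  df4 = tibble(id = 1:4, C = paste(LETTERS[15:18], letters[20:23]))
)

# Hypothetical helper: prefix B with A only where the first word of B differs from A
add_prefix <- function(df) {
  mutate(df, B = if_else(word(B, 1) != A, paste(A, B), B))
}

# Apply the helper only to data frames that have both columns;
# the others are returned untouched
result <- map_if(df_ls, ~ all(c("A", "B") %in% colnames(.x)), add_prefix)

result$df1$B
```

Here `word(B, 1)` from stringr extracts everything before the first space, so a value such as "AB j" would be compared as "AB" rather than "A".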
Upvotes: 7
Reputation: 11981
I am not sure I understood your question, but here is an attempt to answer it:
bind_rows(df_ls) %>% # create one tibble from all data frames
  select(id, A, B) %>% # select the relevant columns
  # keep only rows where both A and B are present (condition 1)
  filter_at(vars("A", "B"), all_vars(!is.na(.))) %>%
  # if B starts with the same letter as A, keep B; otherwise paste A and B together (condition 2)
  mutate(B = if_else(str_extract(A, "^.") != str_extract(B, "^."), paste(A, B), B))
# A tibble: 8 x 3
     id A     B
  <int> <chr> <chr>
1     1 A     A j
2     2 B     B k
3     3 C     C l
4     4 D     D m
5     5 E     E n
6     1 A     A j
7     2 B     B k
8     3 C     C l
After you posted your desired results, here is a way to keep the list:
myfun <- function(df) {
  # && is the scalar "and", which is what an if() condition expects
  if ("A" %in% colnames(df) && "B" %in% colnames(df)) {
    mutate(df, B = if_else(str_extract(A, "^.") != str_extract(B, "^."), paste(A, B), B))
  } else {
    df
  }
}
df_ls %>% map(myfun)
Upvotes: 1