Junitar

Reputation: 999

Purrr - conditionally mutate a column in a list of data frames when it exists

Consider the following list of data frames:

library(tidyverse)

df1 <- tibble(
  id = 1:5,
  A = LETTERS[1:5],
  B = letters[10:14]
)
df2 <- tibble(
  id = 1:3,
  A = LETTERS[1:3],
  B = paste(LETTERS[1:3], letters[10:12])
)
df3 <- tibble(
  id = 1:6,
  B = paste(LETTERS[1:6], letters[10:15])
)
df4 <- tibble(
  id = 1:4,
  C = paste(LETTERS[15:18], letters[20:23])
)

df_ls <- list(df1, df2, df3, df4) %>% 
  set_names(paste0("df", 1:4))

I would like to concatenate the elements of A and B into the B column if that's not already the case. Note that not all the data frames have a B column.

The conditions to do this are as follows:

  1. the data frame must have both A and B columns
  2. the first letter in B must differ from that of A

I'm working with map functions. My attempt so far (without "condition 2"):

df_ls %>% 
  map(
    ~ .x %>% 
      mutate_at(
        vars(matches("B")),
        ~ {
          if (c("A", "B") %in% colnames(.) %>% sum() == 2)
            paste(A, B)
          else
            B
        }
      )
  )

It doesn't work.

Also, I can't work out how to write my second condition. I tried & setequal(. %>% pull(A), . %>% pull(B) %>% word(1)), without success.

Edit:
I need to keep all the data frames separately. Only the B column in df1 should be rewritten. df2, df3 and df4 should remain unchanged.
The expected output is:

$df1
# A tibble: 5 x 3
     id A     B    
  <int> <chr> <chr>
1     1 A     A j  
2     2 B     B k  
3     3 C     C l  
4     4 D     D m  
5     5 E     E n  

$df2
# A tibble: 3 x 3
     id A     B    
  <int> <chr> <chr>
1     1 A     A j  
2     2 B     B k  
3     3 C     C l  

$df3
# A tibble: 6 x 2
     id B    
  <int> <chr>
1     1 A j  
2     2 B k  
3     3 C l  
4     4 D m  
5     5 E n  
6     6 F o  

$df4
# A tibble: 4 x 2
     id C    
  <int> <chr>
1     1 O t  
2     2 P u  
3     3 Q v  
4     4 R w  

Upvotes: 3

Views: 1058

Answers (2)

kath

Reputation: 7724

You can first check whether both A and B are among the column names. If they are, check whether the first character of B (str_sub(B, 1, 1)) differs from A, and where it does, paste A and B together.

With map_if, as suggested by @Moody_Mudskipper:

df_ls %>% 
  map_if(~ all(c("A", "B") %in% colnames(.x)), 
         ~ mutate(.x, B = if_else(str_sub(B, 1, 1) != A, paste(A, B), B)))

More verbose:

df_ls %>% 
  map(~ {
    if (all(c("A", "B") %in% colnames(.x))) {
      .x %>% 
        mutate(B = if_else(str_sub(B, 1, 1) != A, paste(A, B), B))
    } else {
      .x
    }
  })

# $df1
# # A tibble: 5 x 3
#      id A     B    
#   <int> <chr> <chr>
# 1     1 A     A j  
# 2     2 B     B k  
# 3     3 C     C l  
# 4     4 D     D m  
# 5     5 E     E n  
# 
# $df2
# # A tibble: 3 x 3
#      id A     B    
#   <int> <chr> <chr>
# 1     1 A     A j  
# 2     2 B     B k  
# 3     3 C     C l  
# 
# $df3
# # A tibble: 6 x 2
#      id B    
#   <int> <chr>
# 1     1 A j  
# 2     2 B k  
# 3     3 C l  
# 4     4 D m  
# 5     5 E n  
# 6     6 F o  
# 
# $df4
# # A tibble: 4 x 2
#      id C    
#   <int> <chr>
# 1     1 O t  
# 2     2 P u  
# 3     3 Q v  
# 4     4 R w
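
To see what the two checks do on the question's data, here is a minimal sketch that evaluates them separately. The names has_ab and needs_prefix are only illustrative and do not appear in the answer above. Note that comparing the first character of B with A relies on A holding single characters, as it does in this example.

```r
library(dplyr)
library(purrr)
library(stringr)
library(tibble)

# The question's data, rebuilt so the sketch runs on its own
df_ls <- list(
  df1 = tibble(id = 1:5, A = LETTERS[1:5], B = letters[10:14]),
  df2 = tibble(id = 1:3, A = LETTERS[1:3], B = paste(LETTERS[1:3], letters[10:12])),
  df3 = tibble(id = 1:6, B = paste(LETTERS[1:6], letters[10:15])),
  df4 = tibble(id = 1:4, C = paste(LETTERS[15:18], letters[20:23]))
)

# Condition 1: one logical per data frame, TRUE when both A and B exist.
# This is the same test used as the map_if predicate.
has_ab <- map_lgl(df_ls, ~ all(c("A", "B") %in% colnames(.x)))
has_ab

# Condition 2: one logical per row, evaluated only where condition 1 holds.
# TRUE means B does not yet start with the letter in A.
needs_prefix <- map(df_ls[has_ab], ~ str_sub(.x$B, 1, 1) != .x$A)
needs_prefix
```

Only df1 and df2 pass the first check, and only the rows of df1 pass the second, which is why df1 is the only element that map_if ends up changing.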

Upvotes: 7

Cettt

Reputation: 11981

I am not sure I understood your question, but here is an attempt at an answer:

bind_rows(df_ls) %>%  # combine all data frames into one tibble
  select(id, A, B) %>%  # keep the relevant columns
  # keep only rows where both A and B are present (condition 1)
  filter_at(vars("A", "B"), all_vars(!is.na(.))) %>%
  # if B already starts with the same letter as A, keep B;
  # otherwise paste A and B together (condition 2)
  mutate(B = if_else(str_extract(A, "^.") != str_extract(B, "^."), paste(A, B), B))


# A tibble: 8 x 3
     id A     B    
  <int> <chr> <chr>
1     1 A     A j  
2     2 B     B k  
3     3 C     C l  
4     4 D     D m  
5     5 E     E n  
6     1 A     A j  
7     2 B     B k  
8     3 C     C l 
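
The NA filter stands in for condition 1 because bind_rows fills columns that a data frame lacks with NA. The sketch below makes that visible and uses the .id argument of bind_rows to record which list element each row came from. The column name "source" and the object names combined and kept are only illustrative. One caveat, stated as an assumption about other data sets: a row with a genuine NA in A or B would be dropped as well, which cannot happen with the example data.

```r
library(dplyr)
library(tibble)

# The question's data, rebuilt so the sketch runs on its own
df_ls <- list(
  df1 = tibble(id = 1:5, A = LETTERS[1:5], B = letters[10:14]),
  df2 = tibble(id = 1:3, A = LETTERS[1:3], B = paste(LETTERS[1:3], letters[10:12])),
  df3 = tibble(id = 1:6, B = paste(LETTERS[1:6], letters[10:15])),
  df4 = tibble(id = 1:4, C = paste(LETTERS[15:18], letters[20:23]))
)

# Stack everything; "source" holds the name of the originating list element.
# Columns missing from a data frame (A in df3, A and B in df4) become NA.
combined <- bind_rows(df_ls, .id = "source")

# Rows with both A and B present can only come from df1 and df2
kept <- combined %>%
  filter(!is.na(A), !is.na(B))

unique(kept$source)
```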

Update:

After you posted your desired result, here is a way to keep the list:

myfun <- function(df) {
  # && is the scalar "and", which is what an if() condition expects
  if ("A" %in% colnames(df) && "B" %in% colnames(df)) {
    mutate(df, B = if_else(str_extract(A, "^.") != str_extract(B, "^."), paste(A, B), B))
  } else {
    df
  }
}

df_ls %>% map(myfun)
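
As a quick sanity check against the expected output in the question, the following sketch applies the function and compares each element with its input. The function body is repeated (with the column test written via all()) only so that the block runs on its own, and res is an illustrative name.

```r
library(dplyr)
library(purrr)
library(stringr)
library(tibble)

# The question's data, rebuilt so the sketch runs on its own
df_ls <- list(
  df1 = tibble(id = 1:5, A = LETTERS[1:5], B = letters[10:14]),
  df2 = tibble(id = 1:3, A = LETTERS[1:3], B = paste(LETTERS[1:3], letters[10:12])),
  df3 = tibble(id = 1:6, B = paste(LETTERS[1:6], letters[10:15])),
  df4 = tibble(id = 1:4, C = paste(LETTERS[15:18], letters[20:23]))
)

# Same logic as myfun in the answer, repeated for self-containment
myfun <- function(df) {
  if (all(c("A", "B") %in% colnames(df))) {
    mutate(df, B = if_else(str_extract(A, "^.") != str_extract(B, "^."), paste(A, B), B))
  } else {
    df
  }
}

res <- map(df_ls, myfun)

res$df1$B                           # "A j" "B k" "C l" "D m" "E n"
identical(res$df2$B, df_ls$df2$B)   # B in df2 already started with A
identical(res$df3, df_ls$df3)       # no A column, returned untouched
identical(res$df4, df_ls$df4)       # neither A nor B, returned untouched
```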

Upvotes: 1
