Nettle

Reputation: 3321

Move string to new column and replace with NA

I'd like to move a value to a new column, specifically:

1. Detect presence of a regex (string), and if TRUE...
2. Move value to a new column, and...
3. Replace original string with NA

I don't know of an existing move or extract function that does this, so I thought I'd write my own, but I can't figure out the quosures.

library(tidyverse)

# Data
x <- tibble(col1 = letters[1:5])

#> # A tibble: 5 x 1
#>   col1 
#>   <chr>
#> 1 a    
#> 2 b    
#> 3 c    
#> 4 d    
#> 5 e

Here's the outcome I'd like to put into a tidy function.

x %>% 
  mutate(col2 = case_when(                         #<Detect regex; copy to col2
                  str_detect(col1, "[a]") ~ col1),
         col1 = case_when(                         #<remove from col1
                  col1 %in% col2 ~ "",             #<This should be NA
                  TRUE ~ col1),
         col1 = parse_character(col1))             #<parse col1 to NA

#> # A tibble: 5 x 2
#>   col1  col2   
#>   <chr> <chr>  
#> 1 <NA>  a      
#> 2 b     <NA>
#> 3 c     <NA>
#> 4 d     <NA>
#> 5 e     <NA>

The function signature could look like this:

move_to_newcol <- function(my.dataframe, 
                        my.new.col.name, 
                        my.old.col.name, 
                        my.regex){...}

Created on 2018-06-19 by the reprex package (v0.2.0).

Upvotes: 1

Views: 462

Answers (2)

Nettle

Reputation: 3321

Using friendlyeval, a tidyeval 'simplified API' by Miles McBain (author of datapasta):

library(tidyverse)
library(friendlyeval)

# Data
x <- tibble(col1 = letters[1:5])


move_to_newcol <- function(my.dataframe, my.old.col.name, my.new.col.name, my.regex){

  #Treat the literal text input provided as a dplyr column name.
    my.old.col.name <- treat_input_as_col(my.old.col.name)
    my.new.col.name <- treat_input_as_col(my.new.col.name)

  # friendlyeval looks almost identical to dplyr code
  my.dataframe %>%
    mutate(!!my.new.col.name := case_when(
                      str_detect(!!my.old.col.name, my.regex) ~ !!my.old.col.name),
           !!my.old.col.name := case_when(
                      !!my.old.col.name == !!my.new.col.name ~ NA_character_,
                      TRUE ~ !!my.old.col.name))
}

move_to_newcol(x, col1, col2, "[a]")

#> # A tibble: 5 x 2
#>   col1  col2 
#>   <chr> <chr>
#> 1 <NA>  a    
#> 2 b     <NA> 
#> 3 c     <NA> 
#> 4 d     <NA> 
#> 5 e     <NA>

Created on 2018-06-23 by the reprex package (v0.2.0).
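
For comparison, here is a minimal sketch of the same idea written with the `{{ }}` (curly-curly) operator instead of friendlyeval. It assumes rlang 0.4.0 or later and a character column; the name `move_to_newcol_curly` is hypothetical and only used to avoid clashing with the function above. It swaps `case_when()` for `if_else()` so the NA is written directly.

```r
library(dplyr)
library(stringr)

# Hypothetical variant name; assumes old_col is a character column
move_to_newcol_curly <- function(df, old_col, new_col, regex) {
  df %>%
    mutate(
      # Copy values matching the regex into the new column, NA elsewhere
      {{ new_col }} := if_else(str_detect({{ old_col }}, regex),
                               {{ old_col }}, NA_character_),
      # Replace the values that were moved with NA in the old column
      {{ old_col }} := if_else(str_detect({{ old_col }}, regex),
                               NA_character_, {{ old_col }})
    )
}

x <- tibble(col1 = letters[1:5])
move_to_newcol_curly(x, col1, col2, "[a]")
```

Both `if_else()` calls test the original values of the old column, because the old column is only overwritten by the second expression.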

Upvotes: 1

gymbrane

Reputation: 167

How about something like this:

move_to_newcol <- function(df, old_col, new_col, regex){
  # Capture the bare column names as quosures
  old_col_var <- dplyr::enquo(old_col)
  new_col_var <- dplyr::enquo(new_col)
  # Strings for the left-hand side of :=
  oldcol_name <- dplyr::quo_name(old_col_var)
  newcol_name <- dplyr::quo_name(new_col_var)

  df %>%
    # Copy matching values to the new column (NA elsewhere)
    dplyr::mutate(!! newcol_name := dplyr::case_when(
      stringr::str_detect((!! old_col_var), regex) ~ (!! old_col_var))) %>%
    # Replace the moved values in the old column with NA
    dplyr::mutate(!! oldcol_name := dplyr::case_when(
      (!! old_col_var) %in% (!! new_col_var) ~ NA_character_,
      TRUE ~ (!! old_col_var)))
}

You had the bones of it already, I believe. Testing it gives what seems to be the result you want:

move_to_newcol(x, col1, col2, "[a]")
# A tibble: 5 x 2
  col1  col2 
  <chr> <chr>
1 <NA>  a    
2 b     <NA> 
3 c     <NA> 
4 d     <NA> 
5 e     <NA>

or

x %>% move_to_newcol(col1,col2, "[a]")
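
If the column names arrive as strings rather than bare names (the `my.old.col.name` style arguments in the question could be read either way), a rough sketch along the same lines uses the `.data` pronoun and `!!name :=`. The function name `move_to_newcol_chr` is hypothetical, and the sketch assumes a character column.

```r
library(dplyr)
library(stringr)

# Hypothetical variant: old_col and new_col are passed as strings
move_to_newcol_chr <- function(df, old_col, new_col, regex) {
  df %>%
    mutate(
      # .data[[old_col]] looks the column up by its string name
      !!new_col := if_else(str_detect(.data[[old_col]], regex),
                           .data[[old_col]], NA_character_),
      # Replace the moved values with NA in the old column
      !!old_col := if_else(str_detect(.data[[old_col]], regex),
                           NA_character_, .data[[old_col]])
    )
}

x <- tibble(col1 = letters[1:5])
move_to_newcol_chr(x, "col1", "col2", "[a]")
```

Because no quosures are captured, this version needs neither `enquo()` nor `quo_name()`.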

Upvotes: 1
