tidyverse: splitting string to data.frame as rows

Question

I want to split a string on newlines (\n) into rows of a data.frame. The code below does not work as required. Any hints?

library(tidyverse)
Test <- "ASD 7\nDEF \n This"

library(stringr)
str_split(string = Test, pattern = "\n")
[[1]]
[1] "ASD 7" "DEF "  " This"
    
tb <- 
  as_tibble(Test) %>% 
  set_names("Test")

tb %>% 
  str_split(string = Test, pattern = "\n")
[[1]]
[1] NA

Warning message:
In stri_split_regex(string, pattern, n = n, simplify = simplify,  :
  NAs introduced by coercion

Required Output

ASD 7
DEF
This

AnilGoyal · Accepted Answer

str_split() is designed to work on atomic vectors, not on data frames. It has no data argument, so it only works if you pass it the column itself, like this:

str_split(tb$Test, '\n')

[[1]]
[1] "ASD 7" "DEF "  " This"
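
Going by the signature of str_split(), the call in the question already names both string and pattern, so the piped tibble appears to land in the next free argument (n), which would explain the coercion warning. If you prefer to stay in a pipe, a minimal sketch is to extract the column first. It assumes dplyr's pull(), and it builds tb with tibble() rather than as_tibble() purely for illustration:

```r
library(tibble)
library(dplyr)
library(stringr)

# Same data as in the question, built explicitly as a one-column tibble
tb <- tibble(Test = "ASD 7\nDEF \n This")

# pull() extracts the column as a character vector, which str_split() accepts
pieces <- tb %>%
  pull(Test) %>%
  str_split(pattern = "\n")

pieces
```

The result is still a list with one character vector per input string, so it needs unlisting or unnesting before it can become rows.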

OR

> tb %>%
+   mutate(chr_list = str_split(Test, '\n'))
# A tibble: 1 x 2
  Test                 chr_list 
  <chr>                <list>   
1 "ASD 7\nDEF \n This" <chr [3]>

Moreover, if you would like to do it within the data frame, you may use tidyr::separate() or tidyr::separate_rows() like this

tb %>%
  separate_rows(Test, sep = '\n')

# A tibble: 3 x 1
  Test   
  <chr>  
1 "ASD 7"
2 "DEF " 
3 " This"

OR

tb %>%
  separate(Test, into = c('A', 'B', 'C'), sep = '\n')

# A tibble: 1 x 3
  A     B      C      
  <chr> <chr>  <chr>  
1 ASD 7 "DEF " " This"

PS: If you want to remove the white space too, you may trim the pieces with str_trim(), or use '\\s*\\n+\\s*' as the separating pattern

tb %>%
  transmute(text_data = map(str_split(Test, '\n'), ~ str_trim(.x))) %>%
  unnest_longer(text_data)

# A tibble: 3 x 1
  text_data
  <chr>    
1 ASD 7    
2 DEF      
3 This

OR

tb %>%
  separate_rows(Test, sep = "\\s*\\n+\\s*")

# A tibble: 3 x 1
  Test 
  <chr>
1 ASD 7
2 DEF  
3 This
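
As a further variant, here is a minimal sketch that assumes tidyr 1.3.0 or later, where separate_longer_delim() is available and separate_rows() is marked as superseded. Since delim is matched as a fixed string rather than a regular expression, the trimming is done in a separate str_trim() step:

```r
library(tibble)
library(dplyr)
library(tidyr)
library(stringr)

# Same data as in the question, built explicitly as a one-column tibble
tb <- tibble(Test = "ASD 7\nDEF \n This")

result <- tb %>%
  separate_longer_delim(Test, delim = "\n") %>%  # one row per line of text
  mutate(Test = str_trim(Test))                  # drop leading/trailing spaces

result
```

This keeps the split and the clean-up as two explicit steps, which may be easier to read than a combined regular expression.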
