ggg

Reputation: 93

R function to divide a character vector into prefix + string if the prefix is present

I have a character vector like this:

id <- c("A01A01", "A01B01", "A01C01", "R", "S", "T")

I need to cut a prefix from every string that contains it, keeping both substrings. My expected output is then two other vectors like these:

a <- c("A01", "A01", "A01", "", "", "")

b <- c("A01", "B01", "C01", "R", "S", "T")

Upvotes: 1

Views: 135

Answers (3)

akrun

Reputation: 887601

One option is extract() from tidyr, with an optional capture group for the prefix:

library(tidyr)
library(dplyr)
# (A01)? optionally captures the prefix into `a`; (.*) captures the rest into `b`
tibble(id) %>%
  extract(id, into = c("a", "b"), regex = "(A01)?(.*)")
# A tibble: 6 × 2
  a     b    
  <chr> <chr>
1 "A01" A01  
2 "A01" B01  
3 "A01" C01  
4 ""    R    
5 ""    S    
6 ""    T    

Upvotes: 0

GKi

Reputation: 39707

You can use sub like:

sub("^(A01).*|.*", "\\1", id)
#[1] "A01" "A01" "A01" ""    ""    ""   

sub("^A01", "", id)
#[1] "A01" "B01" "C01" "R"   "S"   "T"  

Here ^(A01).*|.* matches either a string that starts with A01 (capturing the prefix) or, failing that, the whole string; \\1 then inserts A01 when it was captured and an empty string otherwise.

Another option would be a lookbehind in strsplit, which splits right after a leading A01 and returns a list:

strsplit(id, "(?<=^A01)", perl=TRUE)
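
strsplit gives a list rather than the two vectors asked for in the question. A minimal sketch of turning that list into a and b, assuming each element splits into either two pieces (prefix found) or one piece (no prefix); the names parts and has_prefix are only illustrative. Note that an id equal to exactly "A01" would come back as a single piece and be treated as having no prefix.

```r
id <- c("A01A01", "A01B01", "A01C01", "R", "S", "T")

# Split right after a leading "A01"; the result is a list with one element per id.
parts <- strsplit(id, "(?<=^A01)", perl = TRUE)

# Assumption: two pieces means the prefix was found, one piece means it was not.
has_prefix <- lengths(parts) == 2L

a <- ifelse(has_prefix, "A01", "")
# The last piece is the remainder after the prefix, or the whole string if nothing was split off.
b <- vapply(parts, function(p) p[length(p)], character(1))
```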

Upvotes: 2

Karthik S

Reputation: 11596

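Does this work? It assumes that only prefixed strings have an even number of characters and that the prefix is their first half: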

a <- sapply(id, function(x) if (nchar(x) %% 2 == 0) substr(x, 1, nchar(x) / 2) else "", USE.NAMES = FALSE)
b <- sapply(id, function(x) if (nchar(x) %% 2 == 0) substr(x, nchar(x) / 2 + 1, nchar(x)) else x, USE.NAMES = FALSE)
a
[1] "A01" "A01" "A01" ""    ""    ""   
b
[1] "A01" "B01" "C01" "R"   "S"   "T"  
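
The length-based check above works for this sample but would also split any other even-length string. A minimal sketch that tests for the prefix itself, assuming the prefix is a literal string (not a regex); split_prefix is a hypothetical helper name, not a function from any package:

```r
# Hypothetical helper: split a literal prefix off each string that starts with it.
split_prefix <- function(x, prefix) {
  has_prefix <- startsWith(x, prefix)
  list(
    # The prefix where it is present, an empty string otherwise.
    a = ifelse(has_prefix, prefix, ""),
    # Everything after the prefix, or the unchanged string when there is no prefix.
    b = ifelse(has_prefix, substring(x, nchar(prefix) + 1L), x)
  )
}

id <- c("A01A01", "A01B01", "A01C01", "R", "S", "T")
res <- split_prefix(id, "A01")
res$a
res$b
```

Because startsWith compares literally, characters such as . or ( in the prefix need no escaping.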

Upvotes: 0
