Reputation: 67
I have two variables, x and y. Each value of x appears inside the corresponding value of y. For example,
x     y
1a    abc 1a 39d
2b    abc 2b 32i
3c    ad ab 3c 32a 32
9d    ab acb 9d 2d
N/A   abc 329d
I would like to separate y into two parts based on x, like the following.
x     y1       y2
1a    abc      39d
2b    abc      32i
3c    ad ab    32a 32
9d    ab acb   2d
N/A   abc      329d
Any suggestions are appreciated, thanks!
I know that gregexpr() can find the location of a pattern, but how do I find the first and last position of the string in x within y in order to split y?
Upvotes: 0
Views: 37
Reputation: 388982
You could get most of the way there with strsplit(), splitting y on x:
df1 <- cbind(df[1], do.call("rbind", strsplit(df$y, df$x)))
df1
#     x   1         2
#1    1a  abc       39d
#2    2b  abc       32i
#3    3c  ad ab     32a 32
#4    9d  ab acb    2d
#5    N/A abc 329d  abc 329d
In the "N/A" rows x does not occur in y, so the whole string is returned and recycled into both columns. Since those rows always have exactly two parts, we can split them on whitespace and overwrite just those rows.
inds <- df$x == "N/A"
df1[inds, 2:3] <- do.call("rbind", strsplit(df$y[inds], "\\s+"))
df1
#     x   1       2
#1    1a  abc     39d
#2    2b  abc     32i
#3    3c  ad ab   32a 32
#4    9d  ab acb  2d
#5    N/A abc     329d
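Two caveats with the code above: strsplit() treats each x as a regular expression, and the pieces keep the spaces that surrounded x. A minimal sketch of one way around both, assuming df has character columns x and y as in the question (the example data and the names sep, parts and mat are only for illustration):

```r
# Example data mirroring the question (assumed to be character columns)
df <- data.frame(
  x = c("1a", "2b", "3c", "9d", "N/A"),
  y = c("abc 1a 39d", "abc 2b 32i", "ad ab 3c 32a 32", "ab acb 9d 2d", "abc 329d"),
  stringsAsFactors = FALSE
)

# Split "N/A" rows on a single space, all other rows on the literal value of x
sep <- ifelse(df$x == "N/A", " ", df$x)

# fixed = TRUE matches each separator literally instead of as a regex;
# the split argument is recycled along df$y, so each row gets its own separator
parts <- strsplit(df$y, sep, fixed = TRUE)

# Keep the first two pieces of every row and trim surrounding spaces
# (this assumes "N/A" rows contain exactly two words)
mat <- t(vapply(parts, function(p) trimws(p[1:2]), character(2)))

df1 <- data.frame(x = df$x, y1 = mat[, 1], y2 = mat[, 2],
                  stringsAsFactors = FALSE)
df1
```

This handles the "N/A" rows in the same pass, so the separate replacement step is not needed.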
Upvotes: 1
Reputation: 47320
Maybe something like this?
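df1 <- read.table(text =
"x y
1a 'abc 1a 39d'
2b 'abc 2b 32i'
3c 'ad ab 3c 32a 32'
9d 'ab acb 9d 2d'
N/A 'abc 329d'", header = TRUE, stringsAsFactors = FALSE)

library(tidyverse)

df1 %>%
  # mark the split point with "|": replace x, or the space when x is "N/A"
  mutate(y = ifelse(x == "N/A",
                    str_replace_all(y, " ", " | "),
                    str_replace_all(y, x, "|"))) %>%
  # split on the marker together with the spaces around it
  separate(y, c("y1", "y2"), sep = " \\| ")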
#     x     y1     y2
# 1  1a    abc    39d
# 2  2b    abc    32i
# 3  3c  ad ab 32a 32
# 4  9d ab acb     2d
# 5 N/A    abc   329d
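The "|" marker would break if y itself ever contained a "|". For comparison, here is a rough base R sketch that skips the marker and works from the first and last position of x inside y, which is closer to what the question asked. It assumes character columns and that "N/A" rows should be split at their first space; the names start, end and no_match are made up for the example.

```r
# Example data mirroring the question
df <- data.frame(
  x = c("1a", "2b", "3c", "9d", "N/A"),
  y = c("abc 1a 39d", "abc 2b 32i", "ad ab 3c 32a 32", "ab acb 9d 2d", "abc 329d"),
  stringsAsFactors = FALSE
)

# First position of each x inside its own y, matched literally (-1 if absent).
# regexpr() is not vectorised over the pattern, hence mapply().
start <- mapply(function(p, s) as.integer(regexpr(p, s, fixed = TRUE)),
                df$x, df$y, USE.NAMES = FALSE)

# Rows where x does not occur in y (the "N/A" rows): use the first space instead
no_match <- start < 0
start[no_match] <- as.integer(regexpr(" ", df$y[no_match], fixed = TRUE))

# Last position of the separator: x itself, or the single space for "N/A" rows
end <- start + ifelse(no_match, 1L, nchar(df$x)) - 1L

# Everything before the separator is y1, everything after it is y2
df$y1 <- trimws(substr(df$y, 1, start - 1L))
df$y2 <- trimws(substr(df$y, end + 1L, nchar(df$y)))
df[c("x", "y1", "y2")]
```

Only the first occurrence of x is used, so a y that repeats x keeps the later occurrences in y2.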
Upvotes: 1