Reputation: 3223
An obvious extension of the question "R split on delimiter (split) keep the delimiter (split)" is: how do I split a string while keeping each delimiter at the beginning of the part that follows it?
x <- "What is this? It's an onion. What! That's| Well Crazy."
The solution from that question:
unlist(strsplit(x, "(?<=[?.!|])", perl=TRUE))
gives:
"What is this?" " It's an onion." " What!" " That's|" " Well Crazy."
Whereas I'm looking for:
"What is this" "? It's an onion" ". What" "! That's" "| Well Crazy."
Changing the positive lookbehind into a positive lookahead doesn't solve the problem.
Upvotes: 1
Views: 71
Reputation: 522741
I managed to solve it using a positive lookahead followed by a word boundary marker:
x <- "What is this? It's an onion. What! That's| Well Crazy."
unlist(strsplit(x, "(?=[?.!|].)\\b", perl=TRUE))
[1] "What is this" "? It's an onion" ". What" "! That's"
[5] "| Well Crazy."
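Note that the \\b in the pattern above only matches when the character just before the delimiter is a word character, so a delimiter preceded by a space or a closing parenthesis would not produce a split. As a rough alternative sketch, not a replacement for the approach above, one could first mark each delimiter with a sentinel and then split on the sentinel. The function name split_keep_leading and the choice of "\001" as sentinel are assumptions made for illustration; the sentinel must not occur in the input.

```r
# Hypothetical helper: split x so that each delimiter starts the next piece.
# Assumes the sentinel character "\001" never appears in the input string.
split_keep_leading <- function(x, delims = "[?.!|]") {
  sentinel <- "\001"
  # Put the sentinel in front of every delimiter that is followed by at
  # least one more character, so a trailing delimiter stays attached.
  pattern <- paste0("(", delims, ")(?=.)")
  marked <- gsub(pattern, paste0(sentinel, "\\1"), x, perl = TRUE)
  # Split on the literal sentinel; the delimiters themselves are kept.
  strsplit(marked, sentinel, fixed = TRUE)[[1]]
}

x <- "What is this? It's an onion. What! That's| Well Crazy."
print(split_keep_leading(x))
```

For the example string this should return the same five pieces as above. If the input begins with a delimiter, the first element of the result is an empty string.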
Upvotes: 1