MarkusN

Reputation: 3223

Split string on delimiter, keeping delimiter before split

A natural extension of the question "R split on delimiter (split) keep the delimiter (split)" is: how can a string be split so that each delimiter is kept at the beginning of the part that follows it?

x <- "What is this?  It's an onion.  What! That's| Well Crazy."

The solution from that question,

unlist(strsplit(x, "(?<=[?.!|])", perl=TRUE))

gives:

"What is this?"    "  It's an onion." "  What!" " That's|" " Well Crazy."

Whereas I'm looking for:

"What is this" "?  It's an onion" ".  What" "! That's" "| Well Crazy."

Changing the positive lookbehind into a positive lookahead doesn't solve the problem.
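
For reference, here is a minimal sketch of what the plain lookahead variant returns. As far as can be told from how `strsplit` behaves, a zero-length match at the very start of the remaining string makes it emit a single character as its own piece, so each delimiter ends up in an element by itself. The name `parts_ahead` is only illustrative and not part of the original question.

```r
x <- "What is this?  It's an onion.  What! That's| Well Crazy."

# Same character class as before, but as a lookahead instead of a lookbehind
parts_ahead <- unlist(strsplit(x, "(?=[?.!|])", perl = TRUE))
print(parts_ahead)
#  [1] "What is this"    "?"               "  It's an onion" "."
#  [5] "  What"          "!"               " That's"         "|"
#  [9] " Well Crazy"     "."
```

The delimiters are split off correctly on the left, but they are not glued to the text that follows them.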

Upvotes: 1

Views: 71

Answers (1)

Tim Biegeleisen

Reputation: 522741

I managed to solve it using a positive lookahead followed by a word boundary marker:

x <- "What is this?  It's an onion.  What! That's| Well Crazy."
# unlist() flattens the one-element list that strsplit() returns
unlist(strsplit(x, "(?=[?.!|].)\\b", perl=TRUE))

[1] "What is this"     "?  It's an onion" ".  What"          "! That's"        
[5] "| Well Crazy."
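
The `\b` in this pattern only matches when the delimiter directly follows a word character, which happens to hold for every delimiter in `x`. For inputs where that is not guaranteed, one possible alternative (a sketch, not the method given above) is to swap `\b` for the lookbehind `(?<=.)`, which only requires that some character precedes the delimiter. The string `y` and the helper `split_before` are hypothetical names introduced for illustration.

```r
# Hypothetical helper: split before each delimiter that has at least one
# character on both sides, keeping the delimiter at the start of the next piece.
split_before <- function(s) {
  unlist(strsplit(s, "(?<=.)(?=[?.!|].)", perl = TRUE))
}

x <- "What is this?  It's an onion.  What! That's| Well Crazy."
y <- "Stop ! Go"   # delimiter preceded by a space, so \b cannot match before it

res_x  <- split_before(x)
res_y  <- split_before(y)
res_wb <- unlist(strsplit(y, "(?=[?.!|].)\\b", perl = TRUE))

print(res_x)   # same five pieces as the \b version produces for x
print(res_y)   # "Stop " "! Go"
print(res_wb)  # "Stop ! Go" (no split with the \b version)
```

In both patterns, the trailing `.` inside the lookahead is what keeps the final period of `x` from triggering a split.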


Upvotes: 1
