Rafael

Reputation: 3196

Split a string at a delimiter conditionally, using a negative lookahead assertion

I want to split a string at . or : unless the next character is ).

Following this question, "R strsplit: Split based on character except when a specific character follows", why isn't

strsplit("Glenelg (Vic.)",'\\.|:(?!\\))', perl = TRUE)

returning

[[1]]
[1] "Glenelg (Vic.)"

Instead, it splits at the ., like so:

[1] "Glenelg (Vic" ")"           

Upvotes: 1

Views: 70

Answers (2)

José

Reputation: 931

You can also use stringr:

stringr::str_split("Glenelg (Vic.)","[\\.:](?!\\))")
[[1]]
[1] "Glenelg (Vic.)"

Upvotes: 0

Wiktor Stribiżew

Reputation: 626689

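The pattern is not grouped correctly. Alternation has the lowest precedence in a regex, so \.|:(?!\)) matches a . anywhere in the string, or a : that is not followed by ). The lookahead applies only to the : branch. If you group the . and : alternatives, '(?:\\.|:)(?!\\))', the lookahead applies to both and it will work.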
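To make the difference concrete, here is a small side-by-side sketch of the ungrouped and grouped patterns on the string from the question. The variable names (x, ungrouped, grouped) are only illustrative.

```r
x <- "Glenelg (Vic.)"

# Ungrouped: the lookahead belongs to the ":" branch only,
# so the "." before ")" still splits the string.
ungrouped <- strsplit(x, '\\.|:(?!\\))', perl = TRUE)[[1]]
print(ungrouped)
# [1] "Glenelg (Vic" ")"

# Grouped: the lookahead now guards both "." and ":",
# so nothing matches and the string comes back whole.
grouped <- strsplit(x, '(?:\\.|:)(?!\\))', perl = TRUE)[[1]]
print(grouped)
# [1] "Glenelg (Vic.)"
```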

However, a character class gives a shorter and more readable pattern:

strsplit("Glenelg (Vic.)",'[.:](?!\\))', perl = TRUE)
[[1]]
[1] "Glenelg (Vic.)"

Here, [.:](?!\)) matches a . or a : only when it is not immediately followed by ).
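As a quick check that the character-class pattern still splits where it should, here is a sketch on a made-up input (the string y is hypothetical, not from the question) that mixes delimiters followed by ) with ones that are not.

```r
# Hypothetical test string: "." and ":" appear both with and without a following ")".
y <- "a.b:c (d.) e:)f"

parts <- strsplit(y, '[.:](?!\\))', perl = TRUE)[[1]]
print(parts)
# [1] "a"            "b"            "c (d.) e:)f"
```

The first . and : split as usual, while the . and : that sit directly before ) are left in place.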

See the regex demo.

Upvotes: 1
