Split string with regex separator except when separator is escaped

Question

I have the following code (treat 'Z' as the escape character and ',' as the separator):

import re

a = 'aaa,bbbZ,cccZZ,dddZZZ,eee'
print re.split(r'(?<!Z),+', a)



Result is:


  ['aaa', 'bbbZ,cccZZ,dddZZZ,eee']


But I need the result to honor escape sequences (in my example the escape character is 'Z'), so that a comma splits only when it is not escaped:


  ['aaa', 'bbbZ,cccZZ', 'dddZZZ,eee']


When I try to use a variable-width pattern in the negative lookbehind assertion:

print re.split(r'(?<!Z(ZZ)*),+', a)


it says:


  sre_constants.error: look-behind requires fixed-width pattern

Wiktor Stribiżew · Accepted Answer

Instead of splitting, you may match the fields with a pattern that repeats either of two alternatives: any character that is not a comma, or a run of 1+ commas preceded by an odd number of Zs:

import re
a = 'aaa,bbbZ,cccZZ,dddZZZ,eee'
print(re.findall(r'(?:(?<!Z)Z(?:ZZ)*,+|[^,])+', a))
# => ['aaa', 'bbbZ,cccZZ', 'dddZZZ,eee']

See the Python demo and a regex demo.

Pattern details:

(?:(?<!Z)Z(?:ZZ)*,+|[^,])+ - 1 or more occurrences of:
  (?<!Z)Z - a Z not immediately preceded by Z
  (?:ZZ)* - zero or more sequences of ZZ
  ,+ - 1 or more commas
  | - or
  [^,] - any char that is not a comma
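To make the breakdown above easier to follow, here is a minimal sketch of the same pattern written with re.VERBOSE so each part carries its own comment. The name ESCAPED_FIELD is hypothetical and used only for this illustration. The pattern has no capturing groups, so findall returns the whole matches.

```python
import re

# ESCAPED_FIELD is a hypothetical name chosen for this sketch.
ESCAPED_FIELD = re.compile(r"""
    (?:
        (?<!Z)Z      # a Z that does not follow another Z (start of a Z run)
        (?:ZZ)*      # any number of ZZ pairs, so the run has odd length
        ,+           # one or more commas escaped by that run
      |
        [^,]         # or any single character other than a comma
    )+
""", re.VERBOSE)

a = 'aaa,bbbZ,cccZZ,dddZZZ,eee'
fields = ESCAPED_FIELD.findall(a)
print(fields)  # ['aaa', 'bbbZ,cccZZ', 'dddZZZ,eee']
```

Note that this approach matches fields rather than separators, so empty fields (for example between two adjacent unescaped commas) are never returned.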



With the PyPI regex module, you may use the regex.split method with the (?<=(?<!Z)(?:ZZ)*),+ regex:


import regex
a = 'aaa,bbbZ,cccZZ,dddZZZ,eee'
print(regex.split(r'(?<=(?<!Z)(?:ZZ)*),+', a))


See another online Python demo.

Here, the pattern matches 1 or more commas (,+) that are preceded by zero or more ZZ pairs that are not themselves preceded by another Z (that is, by an even number of Zs).
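If installing the regex module is not an option, the same even/odd rule can be applied by hand. The following is only a sketch of an alternative technique (a plain scanner, not a regex), assuming a single-character separator and escape character. The function name split_unescaped and its parameters are hypothetical.

```python
def split_unescaped(text, sep=',', esc='Z'):
    """Split on runs of sep that follow an even number of esc characters.

    Hypothetical helper for illustration; assumes sep and esc are
    single characters.
    """
    parts = []
    start = 0  # index where the current field begins
    i = 0
    while i < len(text):
        if text[i] != sep:
            i += 1
            continue
        # Count the escape characters immediately before this separator.
        n_esc = 0
        while i - n_esc - 1 >= 0 and text[i - n_esc - 1] == esc:
            n_esc += 1
        if n_esc % 2 == 1:
            # Odd count: the separator is escaped, so it stays in the field.
            i += 1
            continue
        # Even count: a real separator. Close the field and skip the
        # whole run of separators, like ,+ does.
        parts.append(text[start:i])
        while i < len(text) and text[i] == sep:
            i += 1
        start = i
    parts.append(text[start:])
    return parts


a = 'aaa,bbbZ,cccZZ,dddZZZ,eee'
print(split_unescaped(a))  # ['aaa', 'bbbZ,cccZZ', 'dddZZZ,eee']
```

The escape characters are left in the fields, as in the regex solutions above; removing them would be a separate step.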
