Reputation: 310
Say I have a string 'ad>ad>ad>>ad' and I want to split it on the '>' (not the '>>' chars). I just picked up regex and was wondering if there is a way (a special character) to split on a specific part of the matched expression, rather than on the whole matched expression. For example, the regex could be:
re.split('[^>]>[^>]', 'ad>ad>ad>>ad')
Can you get it to split only on the char in parentheses, as in [^>](>)[^>]?
Upvotes: 3
Views: 222
Reputation:
Try \b>\b
This matches a single > that has a word character (a letter, digit, or underscore) directly on each side. Since the string in the question consists only of letters and > characters, checking for word boundaries with \b is the simplest method.
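A quick sketch of how this behaves, in case it helps. The first string is the one from the question; the second, '(x)>(y)', is a made-up input (not from the question) that shows where the word-boundary approach stops working, because the > there has no word character next to it.

```python
import re

s = 'ad>ad>ad>>ad'

# \b matches between a word character and a non-word character,
# so \b>\b needs a word character on both sides of the >.
word_boundary_parts = re.split(r'\b>\b', s)
print(word_boundary_parts)  # ['ad', 'ad', 'ad>>ad']

# Hypothetical input: the > sits between ')' and '(', which are not
# word characters, so there is no boundary and nothing is split.
no_split_parts = re.split(r'\b>\b', '(x)>(y)')
print(no_split_parts)  # ['(x)>(y)']
```

So this works for the sample string, but it relies on the neighbours of > being word characters rather than on them being "not >".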
Upvotes: 1
Reputation: 626728
You need to use lookarounds:
re.split(r'(?<!>)>(?!>)', 'ad>ad>ad>>ad')
See the regex demo
The (?<!>)>(?!>) pattern only matches a > that is not preceded by a > (due to the negative lookbehind (?<!>)) and that is not followed by a > (due to the negative lookahead (?!>)).
Since lookarounds do not consume characters (unlike negated and positive character classes, such as [^>]), we only match and split on the > symbol without "touching" the symbols around it.
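To make the "consuming" difference concrete, here is a small comparison on the string from the question. The variable names are only for illustration.

```python
import re

s = 'ad>ad>ad>>ad'

# The character classes are part of the match, so the letters next to
# each single > are swallowed by the split.
consuming = re.split(r'[^>]>[^>]', s)
print(consuming)  # ['a', '', 'd>>ad']

# The lookarounds only assert what is (not) around the >, so the match
# is the single > itself and the neighbours are kept.
lookaround = re.split(r'(?<!>)>(?!>)', s)
print(lookaround)  # ['ad', 'ad', 'ad>>ad']
```

The first pattern matches 'd>a' twice, which is why the letters around each > disappear and an empty string appears in the result.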
Upvotes: 2