Reputation: 1356
I need a regular expression that matches a tab character according to the following rules:
"—>text" does not match
"1.—>text" does not match
"1—>text" does not match
"A.—>text" does not match
"text—>text" matches
That is, it shouldn't match a tab at the beginning of the text or right after a leading list item marker ([A-Z] or [0-9], optionally followed by a dot). Here is my expression:
(?<!^((?:\d+|[A-Z])(?:\.)?))\t(?!\1)
https://regex101.com/r/zgJAG9/1
It does not work for all cases.
How can I fix it?
Upvotes: 3
Views: 8446
Reputation: 627126
You can use
(?<!^(?:(?:\d+|[A-Z])\.?)?)\t
See the regex demo. Details:
- (?<!^(?:(?:\d+|[A-Z])\.?)?) - a negative lookbehind that fails the match if, immediately to the left of the current location, there are
  - ^ - start of string
  - (?:(?:\d+|[A-Z])\.?)? - an optional sequence of
    - (?:\d+|[A-Z]) - one or more digits or an uppercase ASCII letter
    - \.? - an optional .
- \t - a tab char.
Note that (?:\.)? is the same as \.?.
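As a quick check, here is a minimal TypeScript sketch that runs the pattern above against the five cases from the question. It assumes a JavaScript engine with lookbehind support (ES2018 or later), since the question does not state the regex flavor. The helper name hasInnerTab is hypothetical and used only for illustration.

```typescript
// The pattern from the answer, written as a string (backslashes doubled):
//   (?<!^(?:(?:\d+|[A-Z])\.?)?)\t
// Assumption: the engine supports variable-length lookbehind (ES2018+).
const tabAfterText = new RegExp("(?<!^(?:(?:\\d+|[A-Z])\\.?)?)\\t");

// Hypothetical helper: true when the string contains a tab that is neither
// at the very start nor directly after a leading list marker.
function hasInnerTab(line: string): boolean {
  return tabAfterText.test(line);
}

// The five cases from the question, with the arrow notation replaced by a real tab.
const samples: string[] = ["\ttext", "1.\ttext", "1\ttext", "A.\ttext", "text\ttext"];
for (const s of samples) {
  console.log(JSON.stringify(s), hasInnerTab(s));
}
```

Only the last sample, "text\ttext", should report true. Without the m flag, ^ anchors at the start of the whole string, so multi-line input would need that flag or per-line processing.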
Capturing groups inside a negative lookbehind also make little sense: the overall match only continues when the lookbehind pattern fails to match, so the group holds no value by the time your backreference is reached.
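To illustrate the point about capturing groups, the sketch below puts a capturing group inside the negative lookbehind and inspects the match result. It again assumes a JavaScript engine with lookbehind support, and the variable names are illustrative only. How a backreference to an unset group then behaves differs between regex flavors, so the sketch only shows that the group stays empty.

```typescript
// A variant with a capturing group inside the negative lookbehind:
//   (?<!^((?:\d+|[A-Z])\.?))\t
const withCapture = new RegExp("(?<!^((?:\\d+|[A-Z])\\.?))\\t");

const m = withCapture.exec("text\ttext");
if (m !== null) {
  // The tab is matched because the lookbehind pattern did NOT match,
  // so group 1 never captured anything.
  console.log(JSON.stringify(m[0]), m[1]);
}
```

Here m[0] is the tab and m[1] is undefined, so a later \1 has nothing to refer to.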
Upvotes: 3