Reputation: 1717
Here is my regex:
(?<!PAYROLL)(FIDELITY(?!.*TITLE)(?!.*NATION)|INVEST)(?!.*PAYROLL)
Here is my text:
INCOMING WIRE TRUST GS INVESTMENT
VANGUARD PAYROLL
PAYROLL FIDELITY
ACH CREDIT FIDELITY INVESTM-FIDELITY
ACH CREDIT FIDELITY INVESTM-FIDELITY
ACH DEBIT FIDELITY
ACH DEBIT FIDELITY
ACH CREDIT FIDELITY INVESTM-FIDELITY
When running this on http://regexr.com (using the PCRE RegEx Engine), it is matching on "PAYROLL FIDELITY", yet I'm specifying a negative lookbehind to not do that: (?<!PAYROLL).
Any help appreciated.
Upvotes: 2
Views: 78
Reputation: 626748
The (?<!PAYROLL) negative lookbehind matches a location that is not immediately preceded by the PAYROLL char sequence. In the PAYROLL FIDELITY string, FIDELITY is not immediately preceded by PAYROLL; it is immediately preceded by PAYROLL + space, so the lookbehind succeeds and the match goes through.
You can solve the current problem in various ways. If you are sure there is always a single whitespace between words in the string (say, it is a tokenized string), add \s after PAYROLL: (?<!PAYROLL\s).
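To see the difference between the two lookbehinds, here is a minimal sketch. It uses Python's built-in re module as a stand-in for PCRE (an assumption on my part; both engines treat these two fixed-width lookbehinds the same way), and it keeps only the FIDELITY part of the pattern for brevity:

```python
import re

lines = [
    "PAYROLL FIDELITY",
    "ACH DEBIT FIDELITY",
]

# Original lookbehind: only rejects FIDELITY glued directly to PAYROLL.
glued = re.compile(r"(?<!PAYROLL)FIDELITY")
# Fixed-width fix: exactly one whitespace character between the two words.
spaced = re.compile(r"(?<!PAYROLL\s)FIDELITY")

# Lines in which each pattern finds a match.
glued_hits = [s for s in lines if glued.search(s)]
spaced_hits = [s for s in lines if spaced.search(s)]

print(glued_hits)   # both lines, including the unwanted PAYROLL FIDELITY
print(spaced_hits)  # only ACH DEBIT FIDELITY
```

The first pattern still matches PAYROLL FIDELITY because the seven characters before FIDELITY are "AYROLL " rather than "PAYROLL".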
If there can be one or more whitespaces, the (?<!PAYROLL\s+) pattern won't work in PCRE, as PCRE lookbehind patterns must be of fixed width. Instead, you might match (some) exceptions and skip them using the (*SKIP)(*FAIL) PCRE verbs:
PAYROLL\s+FIDELITY(*SKIP)(*F)|(FIDELITY(?!.*TITLE)(?!.*NATION)|INVEST)(?!.*PAYROLL)
You may even replace PAYROLL\s+FIDELITY(*SKIP)(*F) with PAYROLL.*?FIDELITY(*SKIP)(*F) or PAYROLL[\s\S]+?FIDELITY(*SKIP)(*F) to skip any chunk of text from PAYROLL up to the leftmost FIDELITY. The PAYROLL\s+FIDELITY(*SKIP)(*F) alternative matches PAYROLL, one or more whitespaces, and FIDELITY, and then (*F) fails the match, which triggers backtracking. Backtracking onto (*SKIP) discards the current attempt, and the search for the next match resumes from the index where the failure occurred, that is, right after FIDELITY.
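If your engine has no (*SKIP)(*F) verbs, a similar effect can be approximated with a plain alternation. The sketch below uses Python's built-in re module, which lacks those verbs and also rejects a variable-width lookbehind. This is a swapped-in technique, not the PCRE solution itself: the unwanted chunk is consumed by a branch without a capture group, and only matches where group 1 participated are kept. The helper name wanted_matches is hypothetical.

```python
import re

# Branch 1 (no capture group) consumes the unwanted PAYROLL ... FIDELITY chunk.
# Branch 2 is the original pattern, wrapped in capture group 1.
pattern = re.compile(
    r"PAYROLL\s+FIDELITY"
    r"|((?:FIDELITY(?!.*TITLE)(?!.*NATION)|INVEST)(?!.*PAYROLL))"
)


def wanted_matches(text):
    """Return only the matches produced by the capturing branch."""
    return [m.group(1) for m in pattern.finditer(text) if m.group(1) is not None]


sample = [
    "INCOMING WIRE TRUST GS INVESTMENT",
    "VANGUARD PAYROLL",
    "PAYROLL FIDELITY",
    "ACH CREDIT FIDELITY INVESTM-FIDELITY",
    "ACH DEBIT FIDELITY",
]

for line in sample:
    print(line, "->", wanted_matches(line))
```

Because the first branch consumes PAYROLL, the whitespace, and FIDELITY, the scan resumes after FIDELITY, much like the (*SKIP)(*F) version does.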
Upvotes: 1