Reputation: 1091
Source string:
<html name="abc:///Testers/something.txt" in="abc/Testers/something.txt" loci="123" sap="abcdefgh="/><html name="abc:///needed.txt" src="abc/needed.txt" location="123" sap="rtyghu"/><html name="abc:///Testers/Testers3/Another.txt" in="abc/Testers/Testers3/Another.txt" loci="123" sap="jhkiopjhg"/><html name="abc:///onemore.txt" src="abc/onemore.txt" location="123" sap="dfrtyu"/>
How do I match each section that starts with <html name=" , is not followed by needed or onemore, and ends with /> ?
In this string there should be two matches:
<html name="abc:///Testers/something.txt" in="abc/Testers/something.txt" loci="123" sap="abcdefgh="/>
<html name="abc:///Testers/Testers3/Another.txt" in="abc/Testers/Testers3/Another.txt" loci="123" sap="jhkiopjhg"/>
I tried this: <html name=(?!(needed|onemore)).*?"\/>
It doesn't work, and I am confused about how the non-greedy quantifier and the negative lookahead interact.
Upvotes: 1
Views: 2740
Reputation: 8833
Here is the breakdown of your regex:
<html name=(?!(needed|onemore)).*?"\/>
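1) Literal match: <html name=
2) Not followed by: needed or onemore. This lookahead is checked only once, immediately after name=, where the next character is " in every tag, so it always succeeds.
3) Lazy grab all: .*?
4) Until literal match: "/>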
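As a quick check of the breakdown above, here is a minimal sketch in Python's re module (the choice of Python is an assumption; the question does not name a regex flavor). It runs the original pattern against the source string from the question and shows that the single lookahead filters nothing out.

```python
import re

# The four tags from the question, concatenated exactly as in the source string.
tags = [
    '<html name="abc:///Testers/something.txt" in="abc/Testers/something.txt" loci="123" sap="abcdefgh="/>',
    '<html name="abc:///needed.txt" src="abc/needed.txt" location="123" sap="rtyghu"/>',
    '<html name="abc:///Testers/Testers3/Another.txt" in="abc/Testers/Testers3/Another.txt" loci="123" sap="jhkiopjhg"/>',
    '<html name="abc:///onemore.txt" src="abc/onemore.txt" location="123" sap="dfrtyu"/>',
]
source = "".join(tags)

# The pattern from the question: the lookahead runs once, right after "name=".
original = re.compile(r'<html name=(?!(needed|onemore)).*?"\/>')

# finditer + group(0) returns whole matches (findall would return the capture group).
original_matches = [m.group(0) for m in original.finditer(source)]

# Every tag is matched, including the two that should have been excluded.
print(len(original_matches))
```

Because the character right after name= is always a double quote, the lookahead never sees needed or onemore, and all four tags come back.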
What you need to do is run the needed/onemore check before every character that is consumed, by wrapping the lookahead and the dot in another group, like this: <html name=(?:(?!(needed|onemore)).)*?"\/>
That checks that neither needed nor onemore comes next at each character grab. (I would also recommend using [^>] instead of . so that you don't need the lazy quantifier.)
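The per-character lookahead can be tried out with a short sketch, again assuming Python's re module. The first pattern is the one given above; the second is one possible reading of the [^>] suggestion, written here as an illustration rather than taken from the answer.

```python
import re

# The four tags from the question, concatenated exactly as in the source string.
tags = [
    '<html name="abc:///Testers/something.txt" in="abc/Testers/something.txt" loci="123" sap="abcdefgh="/>',
    '<html name="abc:///needed.txt" src="abc/needed.txt" location="123" sap="rtyghu"/>',
    '<html name="abc:///Testers/Testers3/Another.txt" in="abc/Testers/Testers3/Another.txt" loci="123" sap="jhkiopjhg"/>',
    '<html name="abc:///onemore.txt" src="abc/onemore.txt" location="123" sap="dfrtyu"/>',
]
source = "".join(tags)

# Lookahead repeated before every consumed character, lazy until "/>.
tempered = re.compile(r'<html name=(?:(?!(needed|onemore)).)*?"\/>')

# Assumed variant: [^>] cannot run past the end of the tag, so a greedy * is safe.
char_class = re.compile(r'<html name=(?:(?!needed|onemore)[^>])*>')

tempered_matches = [m.group(0) for m in tempered.finditer(source)]
class_matches = [m.group(0) for m in char_class.finditer(source)]

for match in tempered_matches:
    print(match)
```

Both patterns return only the something.txt and Another.txt tags for this input. Note that they reject a tag when needed or onemore appears anywhere inside it, not only in the name attribute.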
However, I would recommend using something like this for your filtering: <html name=([^>no]|n(?!eeded)|o(?!nemore))*>
It is much easier to adapt and less work for the regex engine.
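A minimal sketch of the alternation-based pattern, assuming Python's re module and the source string from the question. The lookahead is only evaluated when the current character is n or o, which is where the reduced work for the engine would come from.

```python
import re

# The four tags from the question, concatenated exactly as in the source string.
tags = [
    '<html name="abc:///Testers/something.txt" in="abc/Testers/something.txt" loci="123" sap="abcdefgh="/>',
    '<html name="abc:///needed.txt" src="abc/needed.txt" location="123" sap="rtyghu"/>',
    '<html name="abc:///Testers/Testers3/Another.txt" in="abc/Testers/Testers3/Another.txt" loci="123" sap="jhkiopjhg"/>',
    '<html name="abc:///onemore.txt" src="abc/onemore.txt" location="123" sap="dfrtyu"/>',
]
source = "".join(tags)

# Any character except >, n, o is consumed freely; an n must not start "needed"
# and an o must not start "onemore".
alternation = re.compile(r'<html name=([^>no]|n(?!eeded)|o(?!nemore))*>')

# group(0) is the whole tag; the capture group only holds the last repetition.
alternation_matches = [m.group(0) for m in alternation.finditer(source)]

for match in alternation_matches:
    print(match)
```

To exclude another word, add its first letter to the negated class and add one more alternative with a lookahead for the rest of the word.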
Upvotes: 2