MYK

Reputation: 845

Exclude a chunk enclosed in a tag in a regular expression

I have a text similar to:

In which marked text should match before, but <tag>marked should not match inside a tag</tag>. Also marked should matched after the tag. <tag>This marked should not match either</tag>

In this example, the instances of marked outside the tags should match, but not the ones inside <tag>...</tag>. The closest I got was https://regex101.com/r/CyxVZ3/1, which ignores all matches before </tag>.

Few updates from comments:

Upvotes: 0

Views: 171

Answers (1)

Nahuel Fouilleul

Reputation: 19315

If the engine supports backtracking control verbs (Perl, PHP):

<tag>.*?<\/tag>(*SKIP)(?!)|(?:(?!<tag>).)*

Otherwise it's not possible with a single regex; it will need some additional code.

After reading the comments: in Java 7 it can be done with a Scanner, using the regex as the delimiter. For example:

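import java.util.Scanner;

String string = "In which marked text should match before, but <tag>marked should not match inside a tag</tag>. Also marked should matched after the tag.<tag>This marked should not match either</tag> done";
// Treat every <tag>...</tag> block as a delimiter, so the scanner yields only the text outside the tags.
try ( Scanner scanner = new Scanner( string ) ) {
    scanner.useDelimiter( "<tag>.*?</tag>" );
    while ( scanner.hasNext() ) {
        // Each token is a chunk of text outside any tag; search it for "marked" as needed.
        System.out.println( scanner.next() );
    }
}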
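If the offsets of marked in the original string are needed rather than the chunks between tags, a possible alternative to the Scanner is an alternation that consumes whole <tag>...</tag> blocks in one branch and captures the wanted word in the other. The following is only a sketch under assumptions: the class name OutsideTagMatcher and the method findOutsideTags are made up for illustration, tags are assumed not to be nested, and the word to find is hard-coded as marked.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class OutsideTagMatcher {
    // Left branch: swallow an entire <tag>...</tag> block (no capture group).
    // Right branch: capture the word we actually want in group 1.
    // DOTALL lets a tag block span several lines.
    private static final Pattern PATTERN =
            Pattern.compile("<tag>.*?</tag>|(marked)", Pattern.DOTALL);

    // Returns the start offsets of "marked" occurrences outside <tag> blocks.
    // Assumes tags are not nested.
    public static List<Integer> findOutsideTags(String input) {
        List<Integer> offsets = new ArrayList<>();
        Matcher m = PATTERN.matcher(input);
        while (m.find()) {
            // group(1) is null when the match was a whole tag block.
            if (m.group(1) != null) {
                offsets.add(m.start(1));
            }
        }
        return offsets;
    }

    public static void main(String[] args) {
        String text = "In which marked text should match before, but <tag>marked should not match inside a tag</tag>. Also marked should matched after the tag.<tag>This marked should not match either</tag> done";
        System.out.println(findOutsideTags(text));
    }
}
```

Because the regex engine scans left to right, a tag block is always consumed as a whole before any marked inside it can be tried, so only the occurrences outside the tags reach group 1.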

Upvotes: 1
