Kapitano

Reputation: 163

Find strings not matching pattern in Notepad++ regex

I want to use Notepad++ regex to find all strings that do not match a pattern.

Sample Input Text:

{~Newline~}{~Indent,4~}{~Colour,Blue~}To be or not to be,{~Newline~}{~Indent,6~} {~Colour,Green~}that {~StartItalic~}is{~EndItalic~} the question.{~EndDocument~}

The parts between {~ and ~} are markup codes; everything else is plain text. I want to find every run of text that is not a markup code and insert the code {~Plain~} in front of it. The result would look like this:

{~Newline~}{~Indent,4~}{~Colour,Blue~}{~Plain~}To be or not to be,{~Newline~}{~Indent,6~}{~Colour,Green~}{~Plain~}that {~StartItalic~}{~Plain~}is{~EndItalic~}{~Plain~} the question.{~EndDocument~}

The markup syntax is open-ended, so I can't just keep a list of possible codes to skip.

I could insert {~Plain~} after every ~}, then delete every {~Plain~} that's followed by {~, but that seems incredibly clunky.
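
For comparison, the two-step workaround described above can be sketched in Python. This is only an illustration outside Notepad++: the name `two_step_plain` is made up for this sketch, and Python's `re` module stands in for Notepad++'s regex engine, which may differ in details.

```python
import re

# Sample input from the question.
SAMPLE = (
    "{~Newline~}{~Indent,4~}{~Colour,Blue~}To be or not to be,"
    "{~Newline~}{~Indent,6~} {~Colour,Green~}that "
    "{~StartItalic~}is{~EndItalic~} the question.{~EndDocument~}"
)


def two_step_plain(text):
    """Hypothetical helper: the insert-then-delete workaround."""
    # Step 1: insert {~Plain~} after every closing ~}.
    marked = text.replace("~}", "~}{~Plain~}")
    # Step 2: delete every {~Plain~} that is directly followed by an opening {~.
    return re.sub(r"\{~Plain~\}(?=\{~)", "", marked)


result = two_step_plain(SAMPLE)
print(result)
```

On the sample, this leaves an extra {~Plain~} after the final {~EndDocument~} and another in front of the lone space, so a further cleanup step would be needed.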

Upvotes: 1

Views: 1748

Answers (2)

GalAbra

Reputation: 5148

You need to use a negative lookahead. This regex matches every ~} that is not immediately followed by {~ or by the end of the line, so you can just replace each match with ~}{~Plain~}:

~}(?!{~|$)

If you don't want to match the ~} before the space in {~Indent,6~} {~Colour,Green~}, use this instead (note that it also skips any plain text that begins with a space):

~}(?!{~|$| )
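
As a rough check outside Notepad++, both lookahead patterns can be tried on the sample from the question. This is a sketch under assumptions: Python's `re` module stands in for Notepad++'s regex engine, the braces are escaped for safety, and the names `AFTER_TAG`, `AFTER_TAG_NO_SPACE` and `mark_plain` are invented for illustration.

```python
import re

# Sample input from the question.
SAMPLE = (
    "{~Newline~}{~Indent,4~}{~Colour,Blue~}To be or not to be,"
    "{~Newline~}{~Indent,6~} {~Colour,Green~}that "
    "{~StartItalic~}is{~EndItalic~} the question.{~EndDocument~}"
)

# ~} not followed by {~ or by the end of the text.
AFTER_TAG = re.compile(r"~\}(?!\{~|$)")
# Same, but a ~} followed by a space is skipped as well.
AFTER_TAG_NO_SPACE = re.compile(r"~\}(?!\{~|$| )")


def mark_plain(text, pattern=AFTER_TAG):
    """Hypothetical helper: append {~Plain~} to every matching ~}."""
    return pattern.sub("~}{~Plain~}", text)


with_spaces = mark_plain(SAMPLE)
without_spaces = mark_plain(SAMPLE, AFTER_TAG_NO_SPACE)
print(with_spaces)
print(without_spaces)
```

The first pattern tags the lone space after {~Indent,6~} as plain text. The second avoids that, but it also leaves " the question." after {~EndItalic~} untagged, so neither reproduces the desired output exactly.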

Upvotes: 1

Andrey Tyukin

Reputation: 44918

I hope this works with the current version of Notepad++ (I don't have it at hand right now).

Matching with:

~}((?:[^{]|(?:{[^~]))+){~

and then replacing by

~}{~Plain~}$1{~

might work. The first group should capture everything between closing ~} and the next {~. It will also match { and } in the text, as long as they are not part of an opening tag {~.

EDIT: Additional explanation, so that you can adapt it more easily:

~}             end of the previous tag
(              start of the "interesting" group that contains the text
  (?:          non-capturing group for +
    [^{]       any character except an opening brace
    |          OR
    (?:        nested non-capturing group
      {        opening brace followed by ...
      [^~]     ... some character which is not `~`
    )
  )+           end of non-capturing group for +, repeated 1 or more times
)              end of the "interesting" group
{~             start of the next tag

Here is an interactive example: regex101 example
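
The same match-and-replace can be sketched in Python to see what it does on the sample. This is an approximation, not a Notepad++ test: Python's `re` module is assumed to behave like Notepad++'s engine for this pattern, the braces are escaped, `\g<1>` plays the role of `$1`, and the names `BETWEEN_TAGS` and `mark_plain_between` are made up here.

```python
import re

# Sample input from the question.
SAMPLE = (
    "{~Newline~}{~Indent,4~}{~Colour,Blue~}To be or not to be,"
    "{~Newline~}{~Indent,6~} {~Colour,Green~}that "
    "{~StartItalic~}is{~EndItalic~} the question.{~EndDocument~}"
)

# ~}, then one or more of (non-brace character | "{" not followed by "~"),
# then the opening {~ of the next tag.
BETWEEN_TAGS = re.compile(r"~\}((?:[^{]|\{[^~])+)\{~")


def mark_plain_between(text):
    """Hypothetical helper: put {~Plain~} in front of text between two tags."""
    return BETWEEN_TAGS.sub(r"~}{~Plain~}\g<1>{~", text)


result = mark_plain_between(SAMPLE)
print(result)

# Braces inside the text are kept as plain text.
print(mark_plain_between("{~A~}a {b} c{~B~}"))
# Text after the last tag has no following {~, so it is left alone.
print(mark_plain_between("{~A~}tail"))
```

Because the pattern requires a following {~, plain text at the very end of the document is not tagged, and the lone space after {~Indent,6~} is treated as text.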

Upvotes: 2
