Regex, select all not between square brackets

Question

I have a large file of strings containing a lot of "tags" of the form [[STRING]]. I've been trying to extract these tags in Notepad++ using find and replace with regex enabled. So far all I've managed is to match the contents of [[STRING]].

\[\[([^]]+)\]\]

Can anyone provide a regex for a search and replace that would leave me with just a list of the [[STRING]] tags, each on its own line?

Thanks

Wiktor Stribiżew · Accepted Answer

You can use an alternation of your pattern with a negated version:

(\[\[[^]]+]])|(?:(?!\[\[[^]]+]]).)+
 ^^^^^^^^^^^        ^^^^^^^^^^^

And replace with $1\n. See the regex demo. The ". matches newline" option should be enabled. If performance is poor with this pattern, use an unrolled version:

(\[\[[^]]+]])|[^[]*(?:\[(?!\[)[^[]*)*

See the regex demo
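
To check outside Notepad++ that the two patterns leave the same tags, here is a minimal sketch in Python. It is only an approximation: Python's re module is not the engine Notepad++ uses, re.DOTALL stands in for the ". matches newline" option, and the names TEMPERED, UNROLLED, extract_tags and sample are made up for this illustration.

```python
import re

# Both patterns are copied from the answer above.
# re.DOTALL plays the role of Notepad++'s ". matches newline" checkbox.
TEMPERED = re.compile(r"(\[\[[^]]+]])|(?:(?!\[\[[^]]+]]).)+", re.DOTALL)
UNROLLED = re.compile(r"(\[\[[^]]+]])|[^[]*(?:\[(?!\[)[^[]*)*")


def extract_tags(pattern, text):
    """Emulate 'replace with $1\\n' followed by 'Remove Empty Lines'."""
    # Group 1 holds a tag when the first alternative matched;
    # otherwise it is empty and only the newline is written.
    replaced = pattern.sub(r"\1\n", text)
    return [line for line in replaced.split("\n") if line]


# Hypothetical input: two tags, one single-bracket decoy, one line break.
sample = "intro [[FIRST]] middle [x] text\nmore [[SECOND]] end"

print(extract_tags(TEMPERED, sample))
print(extract_tags(UNROLLED, sample))
```

The raw replacement output may contain a different number of blank lines for the two patterns (the unrolled alternative can match an empty string), which is why the comparison is made only after the empty lines are dropped.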

The (?:(?!\[\[[^]]+]]).)+ is a tempered greedy token. It works like a negated character class, but for sequences of characters rather than single characters: (?:(?!abc).)+ matches any text other than "abc", and here the excluded sequence is a whole [[...]] tag.
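
To see the token on its own, the following Python sketch (again only an approximation of the Notepad++ engine; the names token, text and first are illustrative) applies it to a string and shows where the match stops.

```python
import re

# The tempered greedy token from the answer, without the tag alternative.
token = re.compile(r"(?:(?!\[\[[^]]+]]).)+", re.DOTALL)

# Hypothetical input: a single-bracket decoy, then a real tag.
text = "plain [not a tag] text [[TAG]] tail"

first = token.match(text)

# Before every character the lookahead checks that no complete [[...]]
# tag starts at that position, so the match ends right before "[[TAG]]".
print(repr(first.group()))
print(repr(text[first.end():]))
```

The single-bracket "[not a tag]" is consumed like any other text, because the lookahead only rejects positions where a complete [[...]] tag begins.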

Then, just remove all empty lines (Edit -> Line Operations -> Remove Empty Lines).

You could also use a simpler regex like (\[\[[^]]+]])|. and replace with $1\n. It adds a linebreak for every character outside a tag, but that should not be a problem, since you can remove all the empty lines afterwards. Use whichever works best for you.
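
As a rough illustration of how many linebreaks the simpler pattern produces, here is a Python sketch. Python's re module stands in for the Notepad++ engine, and the names SIMPLE, sample, replaced and tags are made up for the example.

```python
import re

# The simpler pattern from the answer: a tag in group 1, or any one character.
SIMPLE = re.compile(r"(\[\[[^]]+]])|.", re.DOTALL)

# Hypothetical input: 7 characters outside tags and 2 tags.
sample = "ab [[X]] cd [[Y]]"

# Every match, whether a tag or a single character, is replaced by "$1\n".
replaced = SIMPLE.sub(r"\1\n", sample)

# Equivalent of Edit -> Line Operations -> Remove Empty Lines.
tags = [line for line in replaced.split("\n") if line]

print(replaced.count("\n"))  # one linebreak per non-tag character plus one per tag
print(tags)
```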
