Reputation: 411
I am trying to match the string [[....]] (including the brackets), where .... can be anything (including non-printable characters) except ]].
Here is the source to match against:
var myString = `blablablabla[["<strong>LA DEFENSE 4 TEMPS ( La Rotonde )</strong><br />Centre commercial LES 4 TEMPS",
48.89141725,
2.23478235,
"4T"],
["<strong>ANGERS</strong><br />Centre commercial GEANT",
48.89141725,
2.23478235,
"4T"]]blablablabla`;
I tried [^\]]+ to match any characters except a closing bracket. The problem is that I do not know how to apply this approach to a bracket that is immediately followed by a second bracket: [^\]\]]+.
Is there a solution with a positive/negative lookahead or a word boundary?
(\[\[[^\](?=\])]+)
Any help would be appreciated.
Upvotes: 5
Views: 3234
Reputation: 626728
In JavaScript, matching any text between delimiters that consist of more than one character is best done with the [^], [\s\S], [\d\D], or [\w\W] construct followed by a lazy quantifier (*? matches 0 or more occurrences and +? matches 1 or more occurrences of the preceding subpattern, but as few as possible to return a valid match).
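To illustrate why the lazy quantifier matters here, below is a minimal sketch comparing the greedy and lazy forms. The input string is a made-up example with two separate [[...]] blocks, not the data from the question.

```javascript
// Hypothetical input with two separate [[...]] blocks.
const input = 'a[[one]]b[[two]]c';

// Greedy: [\s\S]* runs to the end of the input, then backtracks to the LAST "]]".
const greedy = input.match(/\[\[[\s\S]*\]\]/g);

// Lazy: [\s\S]*? grows one character at a time and stops at the FIRST "]]".
const lazy = input.match(/\[\[[\s\S]*?\]\]/g);

console.log(greedy); // [ '[[one]]b[[two]]' ]
console.log(lazy);   // [ '[[one]]', '[[two]]' ]
```

With the greedy form, everything between the first [[ and the last ]] ends up in a single match, which is usually not what you want when there may be several delimited blocks.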
While the [^] construct, which matches any character including a newline, is JavaScript-specific, [\s\S] and its variants are mostly cross-platform and will work in PCRE, .NET, Python, Java, etc. The [...] in this case is a character class that contains two opposite shorthand classes. Since \s matches all whitespace characters and \S matches all non-whitespace characters, [\s\S] matches any character in any input.
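As a quick check of the above, here is a small sketch showing that a plain dot stops at a newline while the "opposite classes" constructs do not. The test string is an assumption chosen only for illustration.

```javascript
// Hypothetical two-line input.
const text = 'line1\nline2';

// Without the "s" flag, "." does not match "\n", so the whole string cannot match.
const dotMatches = /^.*$/.test(text);

// A class holding two opposite shorthands matches every character, newline included.
const classMatches = /^[\s\S]*$/.test(text);

// [^] (an empty negated class) is the JavaScript-specific equivalent.
const emptyNegatedMatches = /^[^]*$/.test(text);

console.log(dotMatches);          // false
console.log(classMatches);        // true
console.log(emptyNegatedMatches); // true
```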
I'd recommend avoiding (.|\n). This construct causes more backtracking steps and slows the regex search down.
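Apart from speed, there is also a correctness difference worth knowing about: in JavaScript the dot excludes other line terminators besides \n, such as \r, so (.|\n) is not quite "any character". A minimal sketch, using a made-up string that contains a lone carriage return:

```javascript
// Hypothetical input containing a bare carriage return.
const withCR = 'a\rb';

// "." does not match "\r", and the "\n" alternative does not either.
const alternationMatches = /^(.|\n)*$/.test(withCR);

// [\s\S] has no such gap.
const classMatchesCR = /^[\s\S]*$/.test(withCR);

console.log(alternationMatches); // false
console.log(classMatchesCR);     // true
```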
So, you can use
\[\[[\d\D]*?]]
See JS regex demo
Here is a code snippet:
var re = /\[\[[\d\D]*?]]/g;
var str = 'blablablabla[["<strong>LA DEFENSE 4 TEMPS ( La Rotonde )</strong><br />Centre commercial LES 4 TEMPS",\n 48.89141725,\n 2.23478235,\n "4T"],\n ["<strong>ANGERS</strong><br />Centre commercial GEANT",\n 48.89141725,\n 2.23478235,\n "4T"]]blablablabla';
var m;
while ((m = re.exec(str)) !== null) {
    console.log(m[0]);
}
UPDATE
In this case, when the delimiters are different and consist of just 2 characters, you can use a technique often called unrolling the loop: match 0 or more characters other than the first symbol of the closing delimiter, and then 0 or more sequences of that first symbol followed by 1 or more occurrences of any symbol other than it.
\[\[[^\]]*(?:][^\]]+)*]]
See regex demo
Because this pattern moves through the input linearly with very little backtracking, it is really fast.
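To see that the unrolled pattern finds the same text as the lazy one, here is a small sketch. The sample is a shortened, made-up version of the question's data, kept only to show that the inner "]," does not end the match.

```javascript
// Shortened stand-in for the question's data (an assumption, not the real string).
const sample = 'bla[["A",\n 48.89,\n "4T"],\n ["B",\n 2.23,\n "4T"]]bla';

// Lazy version from the first part of the answer.
const lazyRe = /\[\[[\d\D]*?]]/;

// Unrolled version: non-"]" run, then any number of ("]" + non-"]" run), then "]]".
const unrolledRe = /\[\[[^\]]*(?:][^\]]+)*]]/;

const lazyResult = sample.match(lazyRe)[0];
const unrolledResult = sample.match(unrolledRe)[0];

console.log(lazyResult === unrolledResult);      // true
console.log(unrolledResult.endsWith('"4T"]]'));  // true
```

The single ] after the first "4T" is consumed by the (?:][^\]]+) group because it is followed by a comma, so the match only ends at the real ]] delimiter.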
P.S. Note that you do not need to escape ] outside of a character class in a JS regex (unless the u flag is set, in which case an unescaped ] is a syntax error), but inside a character class it must always be escaped.
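A short sketch of the escaping rules, with made-up inputs and no u flag on any of the patterns:

```javascript
// Outside a character class, a bare "]" is a literal (no "u" flag here).
const bareOutside = /]]/.test('a]]b');

// Inside a class it must be escaped: [^\]] means "any character except ]".
const escapedInside = 'a]b'.replace(/[^\]]/g, '-');

// Unescaped, "[]" is parsed as an empty class that can never match,
// so /[]]/ reads as "empty class, then a literal ]" and always fails.
const unescapedInside = /[]]/.test(']');

console.log(bareOutside);     // true
console.log(escapedInside);   // '-]-'
console.log(unescapedInside); // false
```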
Upvotes: 2
Reputation: 5119
Try this:
\[\[(.|\n)*?\]\]
https://regex101.com/r/gR5oJ3/1
It should match anything between and including [[ and ]]. The main issue was dealing with newlines, and the (.|\n) part will match anything including newlines.
Upvotes: 1