Edward

Reputation: 29976

Regex With Nested Square brackets

I have a string such as ('[Test.A[0]]' <>'' OR '[Test.B[0]]' <>'' OR '[Test.C[0]]' <>'' OR '[Test.D[0]]' <> ''), and I want to use a regex to get the items below:

I tried \[.*?\], but it returns [Test.EVAPCT[0].

Upvotes: 1

Views: 1217

Answers (2)

The fourth bird

Reputation: 163217

The pattern \[.*?\] starts the match at [ and continues to the first occurrence of ]. Because .*? can also match [, the match begins at the outer bracket and includes too much.
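As a minimal sketch of that behaviour, the snippet below runs the lazy pattern on a shortened, hypothetical input holding a single item (the variable names are made up for illustration):

```javascript
// Hypothetical shortened input with a single item
const sample = "'[Test.A[0]]' <>''";

// The lazy .*? stops at the first ], which closes the inner [0],
// so the match starts at the outer [ and ends one bracket too early.
const lazyMatches = sample.match(/\[.*?\]/g);
console.log(lazyMatches); // ["[Test.A[0]"]
```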

You could match the digits between the square brackets to make it a bit more specific:

[^\][]+\[\d+\]

The pattern matches

  • [^\][]+ Match any char except the square brackets using a negated character class
  • \[\d+\] Match 1+ digits between the square brackets
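For illustration, here is a short sketch that applies this first pattern to the string from the question; the variable names are assumptions, not part of the original answer:

```javascript
const input = "('[Test.A[0]]' <>'' OR '[Test.B[0]]' <>'' OR '[Test.C[0]]' <>'' OR '[Test.D[0]]' <> '')";

// Non-bracket chars followed by a digits-only index in square brackets
const digitIndexRegex = /[^\][]+\[\d+\]/g;

const items = input.match(digitIndexRegex);
console.log(items); // ["Test.A[0]", "Test.B[0]", "Test.C[0]", "Test.D[0]"]
```

Note that this pattern requires a numeric index, so an item such as [Test.F] would not be matched.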

A slightly broader variant optionally matches chars other than whitespace, square brackets, parentheses, or a single quote before the opening square bracket.

[^\s\][()']*\[[^\s\][]+\]

The pattern matches:

  • [^\s\][()']* Optionally match chars other than those listed in the character class
  • \[ Match [
  • [^\s\][]+ Match 1+ chars other than [ ] or a whitespace char
  • \] Match the closing ]

const str = `('[Test.F]' <>'' OR '[Test.A[0]]' <>'' OR '[Test.B[0]]' <>'' OR '[Test.C[0]]' <>'' OR '[Test.D[0]]' <> '')`;
const regex = /[^\s\][()']*\[[^\s\][]+\]/g;
console.log(str.match(regex));

To match Test.F instead of [Test.F], use a capture group:

\[([^\][]*(?:\[[^\][]*])?)]

const str = `('[Test.A[0]]' <>'' OR '[Test.B[0]]' <>'' OR '[Test.C[0]]' <>'' OR '[Test.D[0]]' <> '') [Test.F]`;
const regex = /\[([^\][]*(?:\[[^\][]*])?)]/g;
console.log(Array.from(str.matchAll(regex), m => m[1]));

Upvotes: 1

Wiktor Stribiżew

Reputation: 626748

You can use either of these patterns:

\w+(?:\.\w+)*\[[^\][]*]
\w+(?:\.\w+)*\[\d+]

Details:

  • \w+ - one or more word chars
  • (?:\.\w+)* - zero or more sequences of a . and one or more word chars
  • \[ - a [ char
  • [^\][]* - zero or more chars other than [ and ] (first pattern), or \d+ - one or more digits (second pattern)
  • ] - a ] char.

See a demo below:

const text = "('[Test.A[0]]' <>'' OR '[Test.B[0]]' <>'' OR '[Test.C[0]]' <>'' OR '[Test.D[0]]' <> '')";
const regex = /\w+(?:\.\w+)*\[[^\][]*]/g;
console.log( text.match(regex) );
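The two variants differ only in what they accept between the inner brackets. A small sketch on a hypothetical input with one numeric and one non-numeric index (the input and variable names are invented for illustration):

```javascript
// Hypothetical input: one numeric index, one non-numeric index
const sampleText = "'[Test.A[0]]' OR '[Test.B[x]]'";

// First variant: any chars other than [ and ] inside the brackets
const anyIndex = sampleText.match(/\w+(?:\.\w+)*\[[^\][]*]/g);
// Second variant: digits only inside the brackets
const digitIndex = sampleText.match(/\w+(?:\.\w+)*\[\d+]/g);

console.log(anyIndex);   // ["Test.A[0]", "Test.B[x]"]
console.log(digitIndex); // ["Test.A[0]"]
```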

To also cater for cases like [Test.F], you may use a regex that follows slightly different logic:

/(?<=\[)\w+(?:\.\w+)*(?:\[[^\][]*])?(?=])/g

See the demo below:

const text = "('[Test.A[0]]' <>'' OR '[Test.B[0]]' <>'' OR '[Test.C[0]]' <>'' OR '[Test.D[0]]' <> '') [Test.F]";
const regex = /(?<=\[)\w+(?:\.\w+)*(?:\[[^\][]*])?(?=])/g;
console.log( text.match(regex) );

Details:

  • (?<=\[) - a location right after a [ char
  • \w+(?:\.\w+)*(?:\[[^\][]*])? - one or more word chars, then zero or more sequences of . and one or more word chars, then an optional occurrence of a [...] substring
  • (?=]) - a location right before a ] char.

Upvotes: 1
