tk--

Reputation: 114

RegEx to find nested Code Blocks

I'm writing a code formatter and I need some help. I have to find the code blocks and I want to use regular expressions. The code I need to format looks basically like this:

KEYWORD name {
    word
    word
    ...
}

I am able to find the blocks that start with { and end with } with this expression:

[{](.*?)[}]

But I don't know how to add the "KEYWORD name" part to the expression. Both are custom strings that can contain any character except ;, { and }.

Another problem is that my code blocks can be nested. I don't know how to add that feature.

Upvotes: 1

Views: 2707

Answers (2)

Jordi

Reputation: 5908

(.+?)\s+(.+?)\s+{(.*?)}

This reads as: a lazily matched run of characters (the keyword), followed by one or more whitespace characters, followed by another lazily matched run of characters (the name), one or more whitespace characters, and your code block in braces.

If the KEYWORD can only contain uppercase letters, and the name only letters, digits and underscores, it should look like this:

([A-Z]+?)\s+([A-Za-z0-9_]+?)\s+\{(.*?)\}

Note that if your code blocks can be nested, you'll have problems with this regex: the lazy .*? stops at the first }, so the match runs from the outer block's { to the inner block's }.
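
To make the nesting problem concrete, here is a minimal Python sketch. It first shows the lazy brace pattern stopping too early on a nested input, and then uses a simple brace-depth counter instead of a regex. The counter is a swapped-in technique, not something proposed in this thread, and find_blocks and the OUTER/INNER sample text are hypothetical names chosen for illustration. It also assumes braces never appear inside strings or comments.

```python
import re

# Sample input (hypothetical): an INNER block nested inside an OUTER block.
nested = "OUTER a {\n    INNER b {\n        word\n    }\n    word\n}"

# The lazy pattern stops at the first '}', which closes INNER, not OUTER.
lazy = re.search(r"\{(.*?)\}", nested, re.DOTALL)


def find_blocks(text):
    """Return (header, body) pairs for the top-level blocks in text.

    Tracks brace depth so a block ends at its matching '}'.
    Assumes braces do not occur inside strings or comments.
    """
    blocks = []
    depth = 0
    header_start = 0  # offset where the current top-level header begins
    body_start = 0    # offset just after the opening '{' of the current block
    header = ""
    for i, ch in enumerate(text):
        if ch == "{":
            if depth == 0:
                header = text[header_start:i].strip()
                body_start = i + 1
            depth += 1
        elif ch == "}":
            if depth == 0:
                raise ValueError("unmatched '}' at offset %d" % i)
            depth -= 1
            if depth == 0:
                blocks.append((header, text[body_start:i]))
                header_start = i + 1
    if depth != 0:
        raise ValueError("unclosed '{'")
    return blocks


outer = find_blocks(nested)
# Nested blocks can be found by calling find_blocks again on a body.
inner = find_blocks(outer[0][1])
```

Calling find_blocks recursively on each body walks the whole tree of blocks, which is usually what a formatter needs in order to compute indentation levels.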

Upvotes: 2

codaddict

Reputation: 455292

You can just do:

KEYWORD name {.*?}

Since you want the . to match newlines as well, you'll have to use single-line (dot-all) mode. Multi-line mode only changes how ^ and $ behave.
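
As a quick check of which flag matters, here is a small sketch using Python's re module (the thread does not name a regex flavor, so Python is an assumption). In Python the flag that lets . match a newline is re.DOTALL (also spelled re.S), while re.MULTILINE only affects ^ and $. The braces are escaped here for portability across flavors.

```python
import re

text = "KEYWORD name {\n    word\n    word\n}"
pattern = r"KEYWORD name \{.*?\}"

# Without a flag, '.' does not match '\n', so the block cannot be spanned.
no_flag = re.search(pattern, text)

# MULTILINE changes only '^' and '$'; '.' still stops at newlines.
multiline = re.search(pattern, text, re.MULTILINE)

# DOTALL lets '.' match newlines, so the whole block is matched.
dotall = re.search(pattern, text, re.DOTALL)
```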

Since both KEYWORD and name are arbitrary strings that can contain any character except ;, { and }:

[^;{}]+\s+[^;{}]+\s*{.*?}
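
A rough Python sketch of the same idea with capture groups follows. It assumes the excluded characters are ;, { and } as stated in the question, that blocks are not nested, and that dot-all mode is on. HEADER_BLOCK and find_flat_blocks are hypothetical names. Because the negated character class also matches whitespace, the captured keyword and name are stripped afterwards.

```python
import re

# keyword, whitespace, name, optional whitespace, then a non-nested { ... } body
HEADER_BLOCK = re.compile(r"([^;{}]+)\s+([^;{}]+)\s*\{(.*?)\}", re.DOTALL)


def find_flat_blocks(text):
    """Return (keyword, name, body) tuples for non-nested blocks."""
    return [
        (m.group(1).strip(), m.group(2).strip(), m.group(3))
        for m in HEADER_BLOCK.finditer(text)
    ]


sample = "KEYWORD name {\n    word\n    word\n}\nOTHER thing {\n    word\n}"
flat = find_flat_blocks(sample)
```

Note that the split between keyword and name is ambiguous when either may itself contain whitespace; with this pattern the last whitespace-separated word before the brace becomes the name.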

Upvotes: 3
