Tono Nam

Reputation: 36058

Why does the following regex match this text?

I have the regex: (?ms)(?<attribute>\[.+?\]|public|private|\s)+?class

and I have the text:

[attribute]
public int a;

[attribute C]
[attribute B]
public class Test{

}

I would like to know why the regex that I posted matches:

[attribute]
public int a;

[attribute C]
[attribute B]
public class

I think it should match:

[attribute C]
[attribute B]
public class

Correct me if I am wrong, but I think the regex should be read as:

Find either an attribute ([some attribute]), the public keyword, the private keyword, or a whitespace character, one or more times, followed by class.

So the regex engine should first match [attribute], then the '\n' (newline), then the public keyword. After that, the keyword int is not one of the alternatives, so why is it included in the match?

Upvotes: 2

Views: 84

Answers (2)

Ria

Reputation: 10347

Use this Regex:

((?<attribute>(?:public|private|\[[^\]]+\]))[\r\n\s]+)*class

and read the captures of the group named attribute. Your code can look like this:

foreach (Match match in Regex.Matches(inputString, @"((?<attribute>(?:public|private|\[[^\]]+\]))[\r\n\s]+)*class"))
{
    var attributes = new List<string>();
    foreach (Capture capture in match.Groups["attribute"].Captures)
    {
        attributes.Add(capture.Value);
    }
}

Upvotes: 1

Mark Byers

Reputation: 838326

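The problem is that you are using a dot, which matches anything, including close square brackets, whitespace, and (in single-line mode, enabled by the s flag) newlines:

\[.+?\]

The lazy quantifier only makes the engine try the shortest match first. When the rest of the pattern fails at int, the engine backtracks and lets .+? run past the first ] up to the one that closes [attribute C], so everything in between is consumed as a single "attribute".

You should use this instead: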

\[[^\]]+\]

Explanation:

\[     Match a literal open square bracket.
[^\]]  Match any character except a close square bracket.
+      One or more.
\]     Match a literal close square bracket.
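
To see the difference, here is a minimal sketch that runs the pattern from the question and the same pattern with only the dot swapped for the negated character class. The RegexDemo class and its member names are made up for illustration; the sample text is the one from the question, assumed to use '\n' line endings.

```csharp
using System;
using System.Text.RegularExpressions;

static class RegexDemo
{
    // Sample text from the question (assumption: '\n' line endings).
    public const string Text =
        "[attribute]\npublic int a;\n\n" +
        "[attribute C]\n[attribute B]\npublic class Test{\n\n}";

    // Pattern from the question: with (?s) the dot also matches ']' and newlines.
    public const string Original =
        @"(?ms)(?<attribute>\[.+?\]|public|private|\s)+?class";

    // Same pattern, with the dot replaced by [^\]] so a bracketed
    // attribute can never run past its own closing bracket.
    public const string Corrected =
        @"(?ms)(?<attribute>\[[^\]]+\]|public|private|\s)+?class";

    static void Main()
    {
        Match before = Regex.Match(Text, Original);
        Match after = Regex.Match(Text, Corrected);

        // The first capture shows what the engine treated as one "attribute".
        Console.WriteLine("Original, first capture:");
        Console.WriteLine(before.Groups["attribute"].Captures[0].Value);

        // Trim() drops whitespace that the \s alternative picks up
        // at the start of the match.
        Console.WriteLine("Corrected, whole match:");
        Console.WriteLine(after.Value.Trim());
    }
}
```

With the original pattern, the first capture should run from [attribute] all the way through [attribute C], which is why public int a; ends up inside the match. With the corrected pattern, the match should contain only the two attributes, public, and class. Note that because \s is one of the alternatives, the corrected match can still begin with the blank lines that precede [attribute C], hence the Trim() call.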

Upvotes: 3
