Reputation: 2579
I am a novice with regex usage in C#. I want a regex that finds the next keyword from a given list, but only when that keyword is not inside quotes.
For example, if I have code that looks like this:
while (t < 10)
{
string s = "get if stmt";
u = GetVal(t, s);
for(;u<8;u++)
{
t++;
}
}
I tried the regex @"(.*?)\s(FOR|WHILE|IF)\s", but it gives me "if" as the next keyword. I want the next keyword after "while" to be "for", not the "if" that sits inside the quoted string.
Can it be done in any way using regex, or will I have to use conventional programming?
Upvotes: 3
Views: 4777
Reputation: 147240
Try the following RegEx (Edit: fixed).
(?:[^\"]|(?:(?:.*?\"){2})*?)(?: |^)(?<kw>for|while|if)[ (]
Note: Because this RegEx literal includes quotes, it is written here as a regular string with \" escapes rather than as a verbatim (@) string; in a verbatim string each quote would have to be doubled ("") instead. Remember that if you add any RegEx special characters to a regular string, you'll need to double-escape them appropriately (e.g. \\w for \w). Ensure that you also specify RegexOptions.Multiline when matching, so that the caret (^) is treated as the start of each line.
This hasn't been tested, but should do the job. Let me know if there are any problems. Also, depending on what more you want to do here, I might recommend using standard text parsing (non-RegEx), as it will quickly become more readable the more data you want to extract from the code. Hope that helps anyway.
Edit: Here's some example code, which I've tested and am pretty confident works as intended.
// Requires: using System.Linq; using System.Text.RegularExpressions;
var input = "while t < 10 loop\n s => 'this is if stmt'; for u in 8..12 loop \n}";
var pattern = "(?:[^\"]|(?:(?:.*?\"){2})*?)(?: |^)(?<kw>for|while|if)[ (]";
// RegexOptions.Multiline makes ^ match at the start of every line, as the note above requires.
var matches = Regex.Matches(input, pattern, RegexOptions.Multiline);
var firstKeyword = matches[0].Groups["kw"].Value;
// The following line is a one-line solution for .NET 3.5/C# 3.0 to get an array of all found keywords.
var keywords = matches.Cast<Match>().Select(match => match.Groups["kw"].Value).ToArray();
Hopefully this should be your complete solution now...
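For comparison, here is a minimal sketch of a different regex technique, not the pattern above: let one alternative consume a whole quoted string and a second alternative capture the keyword, then keep only the matches where the keyword group took part. The class name KeywordFinder and method FindKeywords are made up for illustration, and the sketch assumes ordinary "..." literals with backslash escapes only (no verbatim strings, no comments).

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class KeywordFinder
{
    // Alternative 1 consumes a whole "..." literal (stepping over \" escapes).
    // Alternative 2 captures a keyword as a whole word.
    // Assumption: only regular string literals exist in the input.
    private static readonly Regex Token = new Regex(
        @"""(?:\\.|[^""\\])*""|\b(?<kw>for|while|if)\b");

    public static List<string> FindKeywords(string code)
    {
        var found = new List<string>();
        foreach (Match m in Token.Matches(code))
        {
            // A string-literal match leaves the kw group unset, so it is skipped.
            if (m.Groups["kw"].Success)
                found.Add(m.Groups["kw"].Value);
        }
        return found;
    }
}
```

Because the string alternative is tried first at every position, anything between a pair of quotes is swallowed before the keyword alternative gets a chance to see it. On the code from the question this should yield "while" followed by "for".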
Upvotes: 2
Reputation: 536329
Can it be done in any way using Regex?
In the general case, no. The syntax of C# is not amenable to regex parsing.
Consider these corner cases:
method("xxx\"); while (\"xxx");
method(@"xxx \"); while (...);
// while
/* while */
/* xxx
// xxx */ while
/* xxx " xxx */ while ("...
Languages as complex as C# need dedicated parsers.
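To make two of those corner cases concrete, here is a small sketch. It assumes a hypothetical quote-aware pattern of the kind one might try (skip "..." literals, capture keywords elsewhere); the class CornerCaseDemo and its pattern are illustrative only, not taken from any answer here.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class CornerCaseDemo
{
    // Hypothetical quote-aware pattern: consume "..." literals, capture keywords elsewhere.
    // It knows nothing about comments or verbatim strings.
    private static readonly Regex Token = new Regex(
        @"""(?:\\.|[^""\\])*""|\b(?<kw>for|while|if)\b");

    // Counts only the matches in which the keyword group took part.
    public static int CountKeywords(string code) =>
        Token.Matches(code).Cast<Match>().Count(m => m.Groups["kw"].Success);
}
```

With this pattern, CountKeywords("// while") reports one keyword even though the line is only a comment, and CountKeywords("/* xxx \" xxx */ while (\"...") reports none, because the quote inside the comment is taken as the start of a string that swallows the real while.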
Upvotes: 0
Reputation: 16364
If you decide to go the Regex route you can use this site to test your regular expression
Upvotes: 1
Reputation: 5559
I suppose a regex cannot readily understand C# keywords. I would suggest using Microsoft.CSharp.CSharpCodeProvider instead; Visual Studio uses it to manage C# code.
Upvotes: 0
Reputation: 60987
You can try backreferencing, which would let you match the string, but since you want to do the exact opposite you'd be better off skipping the strings instead, which is actually really easy.
Either write a regex that matches strings and replaces them with nothing, or run through the text skipping quoted strings and looking for keywords in the meantime. I reckon the latter will be more efficient.
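A rough sketch of the second option, walking the text once and stepping over quoted strings, might look like the following. The names KeywordScanner and Scan are made up for illustration, and the sketch assumes only regular "..." literals with backslash escapes; comments, char literals and verbatim strings are not handled.

```csharp
using System;
using System.Collections.Generic;

static class KeywordScanner
{
    private static readonly HashSet<string> Keywords =
        new HashSet<string> { "for", "while", "if" };

    public static List<string> Scan(string code)
    {
        var found = new List<string>();
        int i = 0;
        while (i < code.Length)
        {
            char c = code[i];
            if (c == '"')
            {
                // Skip to the closing quote, stepping over \" and other escapes.
                i++;
                while (i < code.Length && code[i] != '"')
                    i += code[i] == '\\' ? 2 : 1;
                i++; // move past the closing quote
            }
            else if (char.IsLetterOrDigit(c) || c == '_')
            {
                // Read a whole identifier-like word, then test it against the list.
                int start = i;
                while (i < code.Length && (char.IsLetterOrDigit(code[i]) || code[i] == '_'))
                    i++;
                string word = code.Substring(start, i - start);
                if (Keywords.Contains(word))
                    found.Add(word);
            }
            else
            {
                i++;
            }
        }
        return found;
    }
}
```

Reading whole words before comparing them keeps identifiers such as "iffy" or "GetVal" from being mistaken for keywords. On the code from the question, Scan should return "while" followed by "for".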
Upvotes: 0