Ali

Reputation: 71

How do I stop a regular expression from matching certain words?

I have created a regular expression for strings that start with " and end with ", e.g. "mynameis":

"\"(?:[^\"\\]|\\.)*\""

Now I want this expression to exclude the words {we, us, they, and}. How do I do that? For instance, if I input "mynameisalexand", the compiler must ignore {and} and take this string as "mynameisalex".

Upvotes: 1

Views: 581

Answers (2)

Wiktor Stribiżew

Reputation: 627410

A regex match cannot cover non-contiguous text, so keep your regex (or use an unrolled version of it):

"[^"\\]*(?:\\.[^"\\]*)*"

See the regex demo

Then remove the substrings you listed with a plain String.Replace (or with a regex such as we|and|...).

See the C# demo:

using System;
using System.Linq;
using System.Text.RegularExpressions;

var input = "\"mynamesarealexandandrew\" \"mynameisalexand\"";
// Unrolled pattern for a double-quoted string with backslash escapes
var regex = new Regex(@"""[^""\\]*(?:\\.[^""\\]*)*""");
// Strip each unwanted word from every quoted match
var results = regex.Matches(input).Cast<Match>()
                   .Select(p => p.Value.Replace("we", "")
                                       .Replace("us", "")
                                       .Replace("they", "")
                                       .Replace("and", ""))
                   .ToList();
foreach (var s in results)    // DEMO
{
    Console.WriteLine(s);
}
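
The regex alternative mentioned above (we|and|...) could look roughly like the sketch below. The class name QuotedWordStripper and the method name Extract are made up for illustration; only Regex.Matches and Regex.Replace are real library calls. It assumes the unwanted words should be removed wherever they occur inside a quoted match, exactly as the String.Replace chain does.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class QuotedWordStripper
{
    // Unrolled pattern for a double-quoted string with backslash escapes.
    private static readonly Regex Quoted =
        new Regex(@"""[^""\\]*(?:\\.[^""\\]*)*""");

    // Alternation of the unwanted words, longest alternative first.
    private static readonly Regex Unwanted = new Regex("they|and|we|us");

    // Returns every quoted string in the input with the unwanted words removed.
    public static string[] Extract(string input)
    {
        return Quoted.Matches(input).Cast<Match>()
                     .Select(m => Unwanted.Replace(m.Value, ""))
                     .ToArray();
    }
}
```

Regex.Replace scans each match once from left to right, so text produced by one removal is not re-scanned for further words. If the replaced result should itself be checked again, the String.Replace chain or a loop would be needed instead.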

Upvotes: 1

Steve Cooper

Reputation: 21490

You'll need to clean the string up afterwards; a single regex match can't do this on its own.

In fact, what you've got is a small grammar. If we call your acceptable tokens an 'id', then you've defined a language that looks like this:

id (('and'|'we'|'us'|'they') id?)*

That is, at least one id; then one of the words and, we, us, or they; then possibly another id. The whole thing then repeats, allowing you to match

mynameisandrewbuttheyarebothcalledsarah

as id: mynameis 'and' id: rewbut 'they' id: arebothcalledsarah

A regex can recognise this structure, but each match is one contiguous substring, so it cannot hand back the ids with the unwanted words already dropped. Your best bet is to split on the unacceptable words and just stitch the pieces together at the end.
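
A minimal sketch of the split-and-stitch idea in C#, assuming the unwanted words are exactly they, and, we and us. The class name SplitAndStitch and its methods are hypothetical names chosen for this example; Regex.Split and string.Concat are the only library calls used.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class SplitAndStitch
{
    // Splits the text on the unwanted words and returns the pieces in between.
    // Pieces can be empty when a word sits at the start or end of the text.
    public static string[] Ids(string text)
    {
        return Regex.Split(text, "they|and|we|us");
    }

    // Stitches the pieces back together, which drops the unwanted words.
    public static string Clean(string text)
    {
        return string.Concat(Ids(text));
    }
}
```

With the question's example, SplitAndStitch.Clean("mynameisalexand") gives "mynameisalex". Empty pieces are harmless here because concatenating them adds nothing.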

Upvotes: 0
