user2357446

Reputation: 676

Regex word boundaries capturing wrong words

I am having some difficulties trying to get my simple Regex statement in C# working the way I want it to.

If I have a long string and I want to find the word "executive" but NOT "executives", I thought my regex would look something like this:

Regex.IsMatch(input, string.Format(@"\b{0}\b", "executive"));

This, however, still matches inputs that contain only "executives" and not "executive" (singular).

I thought word boundaries in regex, when used at the beginning and end of your regex text, would specify that you only want to match that word and not any other form of that word?

Edit: To clarify what's happening, I am trying to find all of the Notes among Students that contain the word "executive", ignoring words that merely contain "executive". As follows:

var studentMatches =
    Students.SelectMany(o => o.Notes)
        .Where(c => Regex.Match(c.NoteText, string.Format(@"\b{0}\b", query)).Success).ToList();

where query would be "executive" in this case.

What's strange is that while the above code matches on "executives" even though I don't want it to, the following code does not (that is, it does what I expect):

foreach (var stu in Students)
{
    foreach (var note in stu.Notes)
    {
        if (Regex.IsMatch(note.NoteText, string.Format(@"\b{0}\b", query)))
            Console.WriteLine(stu.LastName);
    }
}

Why would a nested foreach loop with the same regex code produce accurate matches, while a LINQ expression seems to return anything that contains the word I am searching for?

Upvotes: 0

Views: 121

Answers (1)

Alexander Petrov

Reputation: 14231

Your LINQ query produces the correct result; it returns exactly what it was written to return.

Let's give the variables proper names to make this clear:

var noteMatches = Students.SelectMany(student => student.Notes)
    .Where(note => Regex.Match(note.NoteText, string.Format(@"\b{0}\b", query)).Success)
    .ToList();

In this query, SelectMany produces a flattened sequence of all notes, so the information about which note belonged to which student is lost. The result is a list of notes, not a list of students.

Meanwhile, the sample code with foreach loops prints the student's last name for each matching note, so the two snippets are not reporting the same thing.
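To illustrate the difference, here is a minimal sketch. The Student and Note classes are hypothetical stand-ins (the real ones are not shown in the question), and the sample notes are made up. It runs a query of the same shape as yours and prints the matching notes, which shows that `\bexecutive\b` does not match "executives" and that the result no longer carries any student information:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical minimal types; the real Student and Note classes are not shown in the question.
public class Note { public string NoteText { get; set; } }
public class Student
{
    public string LastName { get; set; }
    public List<Note> Notes { get; set; }
}

public static class FlattenDemo
{
    // Made-up sample data for illustration only.
    public static List<Student> SampleStudents() => new List<Student>
    {
        new Student { LastName = "Adams", Notes = new List<Note>
        {
            new Note { NoteText = "Met the executive today." },
            new Note { NoteText = "Talked to several executives." }
        }},
        new Student { LastName = "Baker", Notes = new List<Note>
        {
            new Note { NoteText = "Only executives attended." }
        }}
    };

    // Same shape as the query in the question: the result is a list of notes, not students.
    public static List<Note> MatchingNotes(IEnumerable<Student> students, string query)
    {
        string pattern = string.Format(@"\b{0}\b", query);
        return students.SelectMany(student => student.Notes)
            .Where(note => Regex.IsMatch(note.NoteText, pattern))
            .ToList();
    }

    public static void Main()
    {
        // Only the note with the singular word is printed.
        foreach (Note note in MatchingNotes(SampleStudents(), "executive"))
            Console.WriteLine(note.NoteText);
    }
}
```

With this sample data only Adams's first note matches. If the real data appears to match "executives", it is worth checking whether the same student also has another note containing the singular word.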

I assume you need a query like the following:

var studentMatches = Students.Where(student => student.Notes
        .Any(note => Regex.IsMatch(note.NoteText, string.Format(@"\b{0}\b", query))))
    .ToList();

However, it is not clear what result you want when the same student has notes containing both "executive" and "executives".
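As a sketch of that ambiguous case (again with hypothetical Student and Note types and made-up data), the Any-based query returns a student as soon as one of their notes matches, even if other notes contain only "executives". If you need to know which note matched, one option, and not necessarily what you want, is to keep student/note pairs by using the SelectMany overload that takes a result selector:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical minimal types; the real Student and Note classes are not shown in the question.
public class Note { public string NoteText { get; set; } }
public class Student
{
    public string LastName { get; set; }
    public List<Note> Notes { get; set; }
}

public static class PairDemo
{
    // Made-up sample data: Adams has both kinds of notes, Baker only the plural.
    public static List<Student> SampleStudents() => new List<Student>
    {
        new Student { LastName = "Adams", Notes = new List<Note>
        {
            new Note { NoteText = "Met the executive today." },
            new Note { NoteText = "Talked to several executives." }
        }},
        new Student { LastName = "Baker", Notes = new List<Note>
        {
            new Note { NoteText = "Only executives attended." }
        }}
    };

    // Students with at least one note containing the whole word.
    public static List<string> StudentsWithMatch(IEnumerable<Student> students, string query)
    {
        string pattern = string.Format(@"\b{0}\b", query);
        return students
            .Where(student => student.Notes.Any(note => Regex.IsMatch(note.NoteText, pattern)))
            .Select(student => student.LastName)
            .ToList();
    }

    // Flatten, but keep the owning student next to each note.
    public static List<(string LastName, string NoteText)> MatchingPairs(
        IEnumerable<Student> students, string query)
    {
        string pattern = string.Format(@"\b{0}\b", query);
        return students
            .SelectMany(student => student.Notes,
                (student, note) => (LastName: student.LastName, NoteText: note.NoteText))
            .Where(pair => Regex.IsMatch(pair.NoteText, pattern))
            .ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", StudentsWithMatch(SampleStudents(), "executive")));
        foreach (var pair in MatchingPairs(SampleStudents(), "executive"))
            Console.WriteLine(pair.LastName + ": " + pair.NoteText);
    }
}
```

Here Adams is returned by the first query because one matching note is enough, while the pair-based query reports only the specific note that contains the singular word.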

Upvotes: 1
