jaks

Reputation: 4587

C# regular expression problem

I am trying to filter some text using a regex. For example, phone* means I want text such as "Phone booth", "phone cube", etc.

But when I give booth*, it also selects "phone booth". It should not select it, right? Here is the code:

        string[] names = { "phone booth", "hall way", "parking lot", "front door", "hotel lobby" };

        string input = "booth.*, door.*";
        string[] patterns = input.Split(new char[] { ',' });
        List<string> filtered = new List<string>();

        foreach (string pattern in patterns)
        {
            Regex ex = null;
            try
            {
                ex = new Regex(pattern.Trim());
            }
            catch { }
            if (ex == null) continue;

            foreach (string name in names)
            {
                if (ex.IsMatch(name) && !filtered.Contains(name)) filtered.Add(name);
            }
        }

        foreach (string filteredName in filtered)
        {
            MessageBox.Show(filteredName);
        }

It displays "phone booth" and "front door". But as per my criteria, it should not show anything, because no string starts with booth or door.

Is there a problem in my regex?

Upvotes: 1

Views: 372

Answers (5)

Sean Vieira

Reputation: 159875

The problem is that you are not specifying that the string must start with booth or door, only that the string must contain booth or door followed by zero or more characters.

If, however, you change your patterns to ^booth.* and ^door.*, everything should work.

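The caret ( ^ ), it should be noted, means "the beginning of the string", or the beginning of each line if your regular expression is in multiline mode (RegexOptions.Multiline). Multiline mode does not change whether . matches newline characters; that is controlled separately by RegexOptions.Singleline.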
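A minimal sketch of how ^ behaves under different options, assuming the standard .NET System.Text.RegularExpressions API. The sample text and the class name CaretDemo are made up for illustration:

```csharp
using System;
using System.Text.RegularExpressions;

class CaretDemo
{
    static void Main()
    {
        // Hypothetical two-line input; "booth" starts the second line only.
        string text = "phone booth\nbooth seat";

        // Default options: ^ matches only at the very start of the input.
        Console.WriteLine(Regex.IsMatch(text, "^booth"));                          // False

        // RegexOptions.Multiline: ^ also matches right after each '\n'.
        Console.WriteLine(Regex.IsMatch(text, "^booth", RegexOptions.Multiline));  // True

        // RegexOptions.Singleline only changes what '.' matches, not ^.
        Console.WriteLine(Regex.IsMatch(text, "^booth", RegexOptions.Singleline)); // False
    }
}
```

For single-line names like the ones in the question, the default behaviour is already "start of the string", so no option is needed.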

Upvotes: 3

Abe Miessler

Reputation: 85046

You need to specify the start of the string in your regex if you don't want "phone booth" to match.

Example:

^booth.*

will match "booth" but not "phone booth".

booth.*

will match any string that has "booth" in it, including "phone booth".
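To see the difference side by side, here is a small sketch that runs both patterns against a few sample strings. The strings and the class name AnchorDemo are illustrative only:

```csharp
using System;
using System.Text.RegularExpressions;

class AnchorDemo
{
    static void Main()
    {
        foreach (string s in new[] { "booth", "booth seat", "phone booth" })
        {
            // Anchored: the match must begin at the start of the string.
            bool anchored = Regex.IsMatch(s, "^booth.*");
            // Unanchored: the match may begin anywhere in the string.
            bool unanchored = Regex.IsMatch(s, "booth.*");
            Console.WriteLine($"{s}: anchored={anchored}, unanchored={unanchored}");
        }
        // Prints:
        // booth: anchored=True, unanchored=True
        // booth seat: anchored=True, unanchored=True
        // phone booth: anchored=False, unanchored=True
    }
}
```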

Upvotes: 1

Steve Townsend

Reputation: 54138

Your Regex does not constrain where in the input string the match may occur, so the pattern can match anywhere in it. If you want to ensure that you only match initial substrings, you have to specify '^' as the first part of the pattern.

See http://msdn.microsoft.com/en-us/library/az24scfc.aspx for more details.

Upvotes: 0

Ioannis Karadimas

Reputation: 7896

Yes, you should prefix your patterns with "^", like so:

string input = "^booth.*, ^door.*";

This tells the regex engine that you want only strings starting with "booth" or "door". More info here: http://oreilly.com/windows/archive/csharp-regular-expressions.html
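As a rough sketch, the question's loop with anchored patterns could look like the following. The Filter helper and the FilterDemo class are hypothetical names introduced here, Console.WriteLine stands in for MessageBox.Show, and catching ArgumentException is an assumption about how invalid patterns should be skipped:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

class FilterDemo
{
    // Hypothetical helper (not from the question): returns every name
    // matched by at least one of the comma-separated patterns.
    public static List<string> Filter(string[] names, string input)
    {
        var filtered = new List<string>();
        foreach (string pattern in input.Split(','))
        {
            Regex ex;
            try { ex = new Regex(pattern.Trim()); }
            catch (ArgumentException) { continue; } // skip patterns that fail to parse

            foreach (string name in names)
            {
                if (ex.IsMatch(name) && !filtered.Contains(name)) filtered.Add(name);
            }
        }
        return filtered;
    }

    static void Main()
    {
        string[] names = { "phone booth", "hall way", "parking lot", "front door", "hotel lobby" };

        // No name starts with "booth" or "door", so nothing is selected.
        Console.WriteLine(Filter(names, "^booth.*, ^door.*").Count);            // 0

        // Names starting with "phone" or "ho" are selected.
        Console.WriteLine(string.Join(", ", Filter(names, "^phone.*, ^ho.*"))); // phone booth, hotel lobby
    }
}
```

The Trim() call matters here: splitting on ',' leaves a leading space in " ^door.*", and without trimming the pattern would require a space before the start-of-string anchor and never match.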

Upvotes: 1

Mitchel Sellers

Reputation: 63126

If you want to match at the beginning of a string, start the pattern with ^.

So, for example, if you wanted a match to start with phone and then contain any characters after that, you could use the following:

^phone.*

The ^ anchors the match to the start of the string.

Upvotes: 5
