Mr. Spock

Reputation: 685

How do I simplify this pattern matching logic using C# Regex?

Good morning! Hoping someone can help me out here with some pattern matching.

What I want to do is match a string of numbers against a bunch of text. The only catch is that I DO NOT want to match anything that has more numbers to the left and/or right of the number I'm looking for (letters are fine).

Here is some code that works, but it seems that having three IsMatch calls is overkill. Problem is, I can't figure out how to reduce it to just one IsMatch call.

static void Main(string[] args)
{
    List<string> list = new List<string>();

    list.Add("cm1312nfi"); // WANT
    list.Add("cm1312");  // WANT
    list.Add("cm1312n"); // WANT
    list.Add("1312");    // WANT
    list.Add("13123456"); // DON'T WANT
    list.Add("56781312"); // DON'T WANT
    list.Add("56781312444"); // DON'T WANT

    list.Add(" cm1312nfi "); // WANT
    list.Add(" cm1312 ");    // WANT
    list.Add("cm1312n ");    // WANT
    list.Add(" 1312");       // WANT
    list.Add(" 13123456");   // DON'T WANT
    list.Add(" 56781312 ");  // DON'T WANT

    foreach (string s in list)
    {
        // Can we reduce this to just one IsMatch() call???
        if (s.Contains("1312") && !(Regex.IsMatch(s, @"\b[0-9]+1312[0-9]+\b") || Regex.IsMatch(s, @"\b[0-9]+1312\b") || Regex.IsMatch(s, @"\b1312[0-9]+\b")))
        {
            Console.WriteLine("'{0}' is a match for '1312'", s);
        }
        else
        {
            Console.WriteLine("'{0}' is NOT a match for '1312'", s);
        }
    }
}

Thank you in advance for any help you can provide!

~Mr. Spock

Upvotes: 0

Views: 229

Answers (5)

Vandesh

Reputation: 6894

For the curious, here is another approach to the problem:

        foreach (string s in list)
        {
            var rgx = new Regex("[^0-9]");
            // Remove all characters other than digits.
            // A foreach iteration variable is read-only, so store the result in a new local.
            string digits = rgx.Replace(s, "");
            if (digits.Contains("1312") && CheckMatch(digits))
            {
                Console.WriteLine("'{0}' is a match for '1312'", s);
            }
            else
            {
                Console.WriteLine("'{0}' is NOT a match for '1312'", s);
            }
        }

        private static bool CheckMatch(string s)
        {
            var index = s.IndexOf("1312");
            // Check whether the number of characters to the left of '1312' equals the number to its right
            return index == s.Substring(index).Length - 4;
        }

This assumes that "131213121312" should not count as a match.

Upvotes: 0

jcpdotel

Reputation: 1

You can allow only letters (or nothing at all) before and after the number:

@"\b[a-zA-Z]*1312[a-zA-Z]*\b"

Upvotes: 0

Jerry

Reputation: 71538

You can use negative lookarounds for a single check:

@"(?<![0-9])1312(?![0-9])"

(?<![0-9]) makes sure that 1312 doesn't have a digit before it, (?![0-9]) makes sure there's no digit after 1312.
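
As a minimal sketch of how this single check could replace the three IsMatch calls from the question, the lookaround pattern can be wrapped in a small helper. The class name StandaloneNumber and the method IsStandalone are hypothetical names chosen for illustration; they are not part of the original post.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StandaloneNumber
{
    // Hypothetical helper (not from the original post): returns true when
    // `number` occurs in `text` with no digit directly before or after it.
    public static bool IsStandalone(string text, string number)
    {
        // Regex.Escape keeps the pattern safe if `number` ever contains
        // regex metacharacters instead of plain digits.
        string pattern = "(?<![0-9])" + Regex.Escape(number) + "(?![0-9])";
        return Regex.IsMatch(text, pattern);
    }

    public static void Main()
    {
        // A few of the sample strings from the question.
        string[] samples =
        {
            "cm1312nfi", "cm1312", "1312", " cm1312 ",
            "13123456", "56781312", " 56781312 "
        };

        foreach (string s in samples)
        {
            Console.WriteLine(
                "'{0}' is {1}a match for '1312'",
                s,
                IsStandalone(s, "1312") ? "" : "NOT ");
        }
    }
}
```

Because the lookarounds only inspect the characters next to the number, no separate s.Contains("1312") test or \b anchors are needed in this sketch.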

Upvotes: 1

Alireza

Reputation: 10476

To catch invalid patterns use:

Regex.IsMatch(s, @"\b[0-9]*1312[0-9]*\b")

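Also, [0-9] can be replaced with \d (note that in .NET, \d also matches non-ASCII Unicode digits unless RegexOptions.ECMAScript is specified).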
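
One thing worth checking with this pattern: because both [0-9]* groups can match nothing, it also matches a bare "1312" that stands alone as a word, which the question lists as WANT. The sketch below compares it with a variant that requires at least one extra digit on one side. The names InvalidPatternCheck, Loose, Strict and IsWanted are hypothetical, and the Strict pattern is an assumption of this sketch rather than part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class InvalidPatternCheck
{
    // Pattern from the answer: zero or more digits on either side of 1312.
    public const string Loose = @"\b[0-9]*1312[0-9]*\b";

    // Hypothetical variant: at least one extra digit before or after 1312.
    public const string Strict = @"\b(?:[0-9]+1312[0-9]*|1312[0-9]+)\b";

    // Mirrors the shape of the question's condition: the string must contain
    // 1312 and must not be caught by the "invalid" pattern.
    public static bool IsWanted(string s, string invalidPattern)
    {
        return s.Contains("1312") && !Regex.IsMatch(s, invalidPattern);
    }

    public static void Main()
    {
        string[] samples = { "cm1312nfi", "1312", " 1312", "13123456", "56781312" };

        foreach (string s in samples)
        {
            Console.WriteLine(
                "'{0}': loose={1}, strict={2}",
                s,
                IsWanted(s, Loose),
                IsWanted(s, Strict));
        }
    }
}
```

For the sample strings above, the two patterns should disagree only on "1312" and " 1312", where the number has no neighbouring characters inside its word.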

Upvotes: 1

Dave Bish

Reputation: 19646

You can make the character classes optional matches:

if (s.Contains("1312") && !Regex.IsMatch(s, @"\b[0-9]*1312[0-9]*\b"))
{
    ....

Have a look at the amazing Regexplained: http://tinyurl.com/q62uqr3

Upvotes: 1
