Joe

Reputation: 836

Regular expression works in regex tester, but not in C#

I've searched around a bit, but it's hard to tell if this exact question has been answered before. I know you'll let me know if this is a duplicate.

I have a regular expression that matches a string of one or more digits preceded by a backslash. For example, \12345 would match, but \1234f or 12345 would not.

The regex I'm using is ^\\(\d+)$

When I test the expression using various testers it works. For example, see: http://regex101.com/r/cY2bI1/1

However, when I implement it in the following C# code, I fail to get a match.

The implementation:

public string ParseRawUrlAsAssetNumber(string rawUrl) {
    var result = string.Empty;
    const string expression = @"^\\([0-9]+)$";
    var rx = new Regex(expression);
    var matches = rx.Matches(rawUrl);
    if (matches.Count > 0)
    {
        result = matches[0].Value;
    }
    return result;
}

The failing test (NUnit):

[Test]
public void ParseRawUrlAsAssetNumber_Normally_ParsesTheUrl() {
    var f = new Forwarder();
    var validRawUrl = @"\12345";
    var actualResult = f.ParseRawUrlAsAssetNumber(validRawUrl);
    var expectedResult = "12345";
    Assert.AreEqual(expectedResult, actualResult);
}

The test's output:

Expected string length 5 but was 6. Strings differ at index 0.
Expected: "12345"
But was:  "\\12345"
-----------^

Any ideas?

Resolution:

Thanks everyone for the input. In the end I took the following route based on your recommendations, and it is passing tests now.

public string ParseRawUrlAsAssetNumber(string rawUrl)
{
    var result = string.Empty;
    const string expression = @"^\\([0-9]+)$";
    var rx = new Regex(expression);
    var matches = rx.Matches(rawUrl);
    if (matches.Count > 0)
    {
        result = matches[0].Groups[1].Value;
    }
    return result;
}

Upvotes: 1

Views: 2407

Answers (3)

p.s.w.g

Reputation: 148980

The problem is this line from the original version of your code:

var rx = new Regex(Regex.Escape(expression));

By escaping your expression, you're turning all of its special regex characters into literals. Calling Regex.Escape(@"^\\(\d+)$") will return "\^\\\\\(\\d\+\)\$", which will only match the literal string "^\\(\d+)$".

Try just this:

var rx = new Regex(expression);

See MSDN: Regex.Escape for a full explanation and examples of how this method is intended to be used.
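To see the effect of escaping concretely, here is a minimal sketch rather than a definitive implementation. The class name EscapeDemo is made up for illustration; the pattern and the input are the ones from the question.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class, not part of the asker's code.
public static class EscapeDemo
{
    public static void Main()
    {
        const string expression = @"^\\(\d+)$";
        string escaped = Regex.Escape(expression);

        // The unescaped pattern matches a backslash followed by digits.
        Console.WriteLine(Regex.IsMatch(@"\12345", expression)); // True

        // Once escaped, every metacharacter is a literal, so the same
        // input no longer matches.
        Console.WriteLine(Regex.IsMatch(@"\12345", escaped));    // False

        // The escaped pattern does match the pattern text itself.
        Console.WriteLine(Regex.IsMatch(expression, escaped));   // True
    }
}
```

Regex.Escape is meant for literal text that you want to embed inside a larger pattern (user input, for example), not for a pattern you have written yourself.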


Given your updated question, it seems like you also have an issue here:

result = matches[0].Value;

This will return the entire matched substring, not just the first capture group. For that you'll have to use:

result = matches[0].Groups[1].Value;
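As a small sketch of the difference (GroupDemo is a hypothetical class name; the pattern and input are taken from the question), group 0 is always the whole match and group 1 is the first parenthesised capture:

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class, not part of the asker's code.
public static class GroupDemo
{
    public static void Main()
    {
        Match match = Regex.Match(@"\12345", @"^\\([0-9]+)$");

        // Value and Groups[0] both hold the entire matched text,
        // including the leading backslash.
        Console.WriteLine(match.Value);           // \12345
        Console.WriteLine(match.Groups[0].Value); // \12345

        // Groups[1] holds only what the first (...) captured.
        Console.WriteLine(match.Groups[1].Value); // 12345
    }
}
```

This would also explain the test output: the returned string was the full match with its backslash, which is one character longer than the expected "12345".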

Upvotes: 7

Sergey Berezovskiy

Reputation: 236188

Don't escape the pattern. Also, simply use Regex.Match, since you expect only a single match here. Use Match.Success to check whether the input matched your pattern, and return the group value, because the digits are in the first capture group of the match:

public string ParseRawUrlAsAssetNumber(string rawUrl)
{
    const string pattern = @"^\\(\d+)$";

    var match = Regex.Match(rawUrl, pattern);
    if (!match.Success)
        return String.Empty;

    return match.Groups[1].Value;
}

Upvotes: 3

Kyle Gobel

Reputation: 5750

What if you try getting the group result instead?

match.Groups[1].Value

When I get to a real computer I'll test it, but it seems like it should work.

Upvotes: 1
