Reputation: 836
I've searched around a bit, but it's hard to tell if this exact question has been answered before. I know you'll let me know if this is a duplicate.
I have a regular expression that matches a series of one or more positive integers preceded by a backslash. For example: \12345 would match, but \1234f or 12345 would not match.
The regex I'm using is ^\\(\d+)$
When I test the expression using various testers it works. For example, see: http://regex101.com/r/cY2bI1/1
However, when I implement it in the following c# code, I fail to get a match.
The implementation:
public string ParseRawUrlAsAssetNumber(string rawUrl) {
var result = string.Empty;
const string expression = @"^\\([0-9]+)$";
var rx = new Regex(expression);
var matches = rx.Matches(rawUrl);
if (matches.Count > 0)
{
result = matches[0].Value;
}
return result;
}
The failing test (NUnit):
[Test]
public void ParseRawUrlAsAssetNumber_Normally_ParsesTheUrl() {
var f = new Forwarder();
var validRawUrl = @"\12345";
var actualResult = f.ParseRawUrlAsAssetNumber(validRawUrl);
var expectedResult = "12345";
Assert.AreEqual(expectedResult, actualResult);
}
The test's output:
Expected string length 5 but was 6. Strings differ at index 0.
Expected: "12345"
But was: "\\12345"
-----------^
Any ideas?
Resolution:
Thanks everyone for the input. In the end I took the following route based on your recommendations, and it is passing tests now.
public string ParseRawUrlAsAssetNumber(string rawUrl)
{
var result = string.Empty;
const string expression = @"^\\([0-9]+)$";
var rx = new Regex(expression);
var matches = rx.Matches(rawUrl);
if (matches.Count > 0)
{
result = matches[0].Groups[1].Value;
}
return result;
}
Upvotes: 1
Views: 2407
Reputation: 148980
The problem is this line:
var rx = new Regex(Regex.Escape(expression));
By escaping your expression you're turning all your special regex characters into literals. Calling Regex.Escape(@"^\\(\d+)$")
will return "\^\\\\\(\\d\+\)\$"
, which will only match the literal string "^\\(\d+)$"
Try just this:
var rx = new Regex(expression);
See MSDN: Regex.Escape
for a full explanation and examples of how this method is intended to be used.
Given your updated question, it seems like you also have an issue here:
result = matches[0].Value;
This will return the entire matched substring, not just the first capture group. For that you'll have to use:
result = matches[0].Groups[1].Value;
Upvotes: 7
Reputation: 236188
Don't escape pattern. Also simply use Regex.Match
thus you are going to have single match here. Use Match.Success
to check if input matched your pattern. And return group value - digits are in group of your matched expression:
public string ParseRawUrlAsAssetNumber(string rawUrl)
{
const string pattern = @"^\\(\d+)$";
var match = Regex.Match(rawUrl, pattern);
if (!match.Success)
return String.Empty;
return match.Groups[1].Value;
}
Upvotes: 3
Reputation: 5750
What if you try get the group result instead?
match.Groups[1].Value
When I get to a real computer I'll test, but seems like it should work
Upvotes: 1