Reputation: 37710
I'm trying to parse a file name and remove a potential number in parentheses (which appears when multiple files share the same base name), but only the last one.
Here are some expected results:
Test
==> Test
Test (1)
==> Test
Test (1) (2)
==> Test (1)
Test (123) (232)
==> Test (123)
Test (1) foo
==> Test (1) foo
I tried this regex: (.*)( ?\(\d+\))+
but the first test fails.
I also tried: (.*)( ?\(\d+\))?
but then only the first test succeeds.
I suspect there's something wrong with the quantifiers in the regex, but I couldn't find exactly what.
How can I fix my regex?
Upvotes: 4
Views: 1006
Reputation: 163632
You could use your first pattern (.*)( ?\(\d+\))+
and replace the match with the first capturing group only.
To optimize it a bit, you could remove the + quantifier
after the last group and drop the second capturing group.
The pattern then removes the last number in parentheses: .* first matches to the end of the string and then backtracks to the last occurrence of a space followed by digits in parentheses.
In the replacement, use the first capturing group:
^(.*) \(\d+\)
Explanation
^          Start of string
(.*)       Capture group 1, match any char 0+ times
 \(\d+\)   Match a space, then (, 1+ digits, and )
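The replacement described above could be sketched in C# as follows. The class and method names are hypothetical. As written, ^(.*) \(\d+\) is not anchored at the end, so it would also turn "Test (1) foo" into "Test foo"; this sketch adds a trailing $ (an assumption, not part of the pattern above) so that only a number at the very end of the name is removed.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LastCounterDemo
{
    // Hypothetical helper: the pattern from the answer plus a trailing $ anchor
    // (the $ is an assumption added here so that "Test (1) foo" stays untouched).
    public static string StripLastCounter(string name)
    {
        // "$1" keeps only what capture group 1 matched, i.e. everything
        // before the last " (digits)" suffix.
        return Regex.Replace(name, @"^(.*) \(\d+\)$", "$1");
    }

    public static void Main()
    {
        string[] samples =
        {
            "Test", "Test (1)", "Test (1) (2)", "Test (123) (232)", "Test (1) foo"
        };
        foreach (string s in samples)
        {
            Console.WriteLine("{0} ==> {1}", s, StripLastCounter(s));
        }
    }
}
```

If a name does not end in a space followed by a number in parentheses, the pattern does not match and Regex.Replace returns the input unchanged.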
Upvotes: 0
Reputation: 42051
As an alternative you could use an end of string / line anchor:
\s*\(\d+\)$
string resultString = null;
try {
    resultString = Regex.Replace(subjectString, @"\s*\(\d+\)$", "", RegexOptions.Multiline);
} catch (ArgumentException ex) {
    // Syntax error in the regular expression
}
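\s*   zero or more whitespace characters
\(    a literal opening parenthesis
\d+   one or more digits
\)    a literal closing parenthesis
$     end of the line (of each line, because of RegexOptions.Multiline)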
Upvotes: 2
Reputation: 11478
You can avoid regular expressions altogether. If you simply want everything before the last opening parenthesis, you could do:
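string example = @"Test (1) (2) (3) (4)";
public string GetPathName(string input)
{
    // Find the last opening parenthesis; without one there is nothing to strip.
    var position = input.LastIndexOf('(');
    if (position == -1)
        return input;
    // Keep everything before it, without the space that precedes the parenthesis.
    return input.Substring(0, position).TrimEnd();
}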
You know that the left parenthesis will always be at the start of the trailing part of the name, so why not find its index and then take everything from position zero up to it? I know you asked for a regular expression, but if you do not need one, why over-engineer?
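A slightly stricter non-regex variant is sketched below, for the case where the name may contain parentheses that are not a trailing counter (such as "Test (1) foo" from the question). The class and method names are hypothetical, and the ASCII-only digit check is an assumption.

```csharp
using System;

public static class PathNameDemo
{
    // Hypothetical helper: strips a trailing "(digits)" group only when it is
    // the very last thing in the name; any other input is returned unchanged.
    public static string StripTrailingCounter(string input)
    {
        if (!input.EndsWith(")", StringComparison.Ordinal))
            return input;

        int open = input.LastIndexOf('(');
        if (open == -1)
            return input;

        // Text between the last '(' and the final ')'.
        string inner = input.Substring(open + 1, input.Length - open - 2);
        if (inner.Length == 0)
            return input;

        // Assumption: only ASCII digits count as a counter.
        foreach (char c in inner)
        {
            if (c < '0' || c > '9')
                return input;
        }

        // Drop the counter and the whitespace in front of it.
        return input.Substring(0, open).TrimEnd();
    }

    public static void Main()
    {
        string[] samples = { "Test", "Test (1)", "Test (1) (2)", "Test (1) foo" };
        foreach (string s in samples)
        {
            Console.WriteLine("{0} ==> {1}", s, StripTrailingCounter(s));
        }
    }
}
```

The extra checks cost a few lines but avoid cutting names whose last parenthesis is followed by other text or does not contain a number.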
Upvotes: 0
Reputation: 43199
Just use a negative lookahead:
\s*\([^()]+\)(?!.*\([^()]+\))
\s* # optional whitespace
\([^()]+\) # (...)
(?!.*\([^()]+\)) # negative lookahead: no further (...) may follow
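A minimal C# sketch of how this lookahead pattern might be applied, keeping the inline comments by way of RegexOptions.IgnorePatternWhitespace. The class and method names are hypothetical. Note that, as written, the pattern removes the last parenthesized group whatever it contains and wherever it sits, so "Test (1) foo" becomes "Test foo"; replacing [^()]+ with \d+ and appending $ would be one way to restrict it to a trailing number.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaheadDemo
{
    // The pattern from the answer, laid out with comments. IgnorePatternWhitespace
    // makes the engine skip the layout whitespace and the # comments.
    private static readonly Regex LastGroup = new Regex(@"
        \s*                 # optional whitespace
        \([^()]+\)          # (...)
        (?!.*\([^()]+\))    # no further (...) may follow
        ", RegexOptions.IgnorePatternWhitespace);

    // Hypothetical helper: deletes the last parenthesized group, if any.
    public static string RemoveLastGroup(string name)
    {
        return LastGroup.Replace(name, "");
    }

    public static void Main()
    {
        string[] samples = { "Test", "Test (1)", "Test (1) (2)", "Test (1) foo" };
        foreach (string s in samples)
        {
            Console.WriteLine("{0} ==> {1}", s, RemoveLastGroup(s));
        }
    }
}
```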
Upvotes: 3
Reputation: 27743
My guess is that you want an expression similar to:
^(.*?)\s*(\(\s*\d+\)\s*)?$
using System;
using System.Text.RegularExpressions;
public class Example
{
    public static void Main()
    {
        string pattern = @"^(.*?)\s*(\(\s*\d+\)\s*)?$";
        string input = @"Test
Test (1)
Test (1) (2)
Test (1) (2) (3)
Test (1) (2) (3) (4)
";
        RegexOptions options = RegexOptions.Multiline;
        foreach (Match m in Regex.Matches(input, pattern, options))
        {
            // Group 1 holds the name without the last "(number)" suffix.
            Console.WriteLine("'{0}' found at index {1}.", m.Groups[1].Value, m.Index);
        }
    }
}
The expression is explained in the top-right panel of regex101.com if you wish to explore, simplify, or modify it; there you can also watch how it matches against some sample inputs.
Upvotes: 6