Reputation: 1550
I want to use regex to get a filename without extension. I'm having trouble getting regex to return a value. I have this:
string path = @"C:\PERSONAL\TEST\TESTFILE.PDF";
var name = Regex.Match(path, @"(.+?)(\.[^\.]+$|$)").Value;
In this case, name always comes back as C:\PERSONAL\TEST\TESTFILE.PDF. What am I doing wrong? I think my search pattern is correct.
(I am aware that I could use Path.GetFileNameWithoutExtension(path), but I specifically want to try using regex.)
Upvotes: 1
Views: 2617
Reputation: 22896
Can be a bit shorter and greedier:
var name = Regex.Replace(@"C:\PERS.ONAL\TEST\TEST.FILE.PDF", @".*\\(.*)\..*", "$1"); // "TEST.FILE"
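A quick sketch of how that replacement behaves on a few inputs. The helper name StripWithReplace and the extra sample paths are made up for illustration. Note what happens when the file name has no dot: either nothing matches and the input comes back unchanged, or a dot in a directory name gets picked up instead.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper wrapping the one-liner above.
static string StripWithReplace(string path) =>
    Regex.Replace(path, @".*\\(.*)\..*", "$1");

Console.WriteLine(StripWithReplace(@"C:\PERS.ONAL\TEST\TEST.FILE.PDF")); // TEST.FILE
Console.WriteLine(StripWithReplace(@"C:\PERSONAL\TEST\TESTFILE.PDF"));   // TESTFILE

// No dot anywhere after a backslash: the pattern does not match,
// so Regex.Replace returns the input unchanged.
Console.WriteLine(StripWithReplace(@"C:\PERSONAL\TEST\README"));         // C:\PERSONAL\TEST\README

// No extension, but a dot in a directory name: the pattern backtracks
// to that dot and captures the wrong segment.
Console.WriteLine(StripWithReplace(@"C:\PERS.ONAL\TEST\README"));        // PERS
```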
Upvotes: 1
Reputation: 31721
Since the data you want is on the right side of the string, tell the regex engine to work from the end of the string toward the beginning by using the option RegexOptions.RightToLeft. This can reduce processing time and lets you get away with a simpler pattern.
The pattern below is still written left to right. It says: capture a run of characters that are not a \ (so the match cannot cross the last path separator), followed by a literal period.
Regex.Match(@"C:\PERSONAL\TEST\TESTFILE.PDF",
@"([^\\]+)\.",
RegexOptions.RightToLeft)
.Groups[1].Value
Prints out
TESTFILE
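For anyone who wants to try it on more than one input, here is a minimal sketch. The helper name NameRightToLeft and the empty-string fallback for "no match" are my own assumptions, not part of the original snippet.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper around the RightToLeft match shown above.
static string NameRightToLeft(string path)
{
    // The engine scans from the end of the input, so the literal period
    // is located first (the last dot), then [^\\]+ extends leftwards
    // until it reaches a backslash or the start of the string.
    Match m = Regex.Match(path, @"([^\\]+)\.", RegexOptions.RightToLeft);
    return m.Success ? m.Groups[1].Value : string.Empty; // assumed fallback
}

Console.WriteLine(NameRightToLeft(@"C:\PERSONAL\TEST\TESTFILE.PDF"));  // TESTFILE
Console.WriteLine(NameRightToLeft(@"C:\PERSONAL\TEST\TEST.FILE.PDF")); // TEST.FILE
```

Only the last period is treated as the extension separator, so earlier dots in the file name are kept.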
Upvotes: 1
Reputation: 982
You need Groups[1].Value:
string path = @"C:\PERSONAL\TEST\TESTFILE.PDF";
var match = Regex.Match(path, @"(.+?)(\.[^\.]+$|$)");
if(match.Success)
{
var name = match.Groups[1].Value;
}
match.Value returns the entire matched substring.
match.Groups[0] always has the same value as match.Value.
match.Groups[1] returns the value of the first capture group.
For example:
string path = @"C:\PERSONAL\TEST\TESTFILE.PDF";
var match = Regex.Match(path, @"(.+?)(\.[^\.]+$|$)");
if(match.Success)
{
    Console.WriteLine(match.Value);
    // returns the entire matched substring
    // Output: C:\PERSONAL\TEST\TESTFILE.PDF
    Console.WriteLine(match.Groups[0].Value);
    // always the same as match.Value
    // Output: C:\PERSONAL\TEST\TESTFILE.PDF
    Console.WriteLine(match.Groups[1].Value);
    // returns the first capture group, which is (.+?) in this case
    // Output: C:\PERSONAL\TEST\TESTFILE
    Console.WriteLine(match.Groups[2].Value);
    // returns the second capture group, which is (\.[^\.]+$|$) in this case
    // Output: .PDF
}
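Note that Groups[1] here still contains the directory part. If you only want the bare file name, one possible variation (my own tweak, not the pattern from the question) is to keep the capture from crossing a backslash. The helper name FileNameOnly and the empty-string fallback are made up for this sketch.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: group 1 is the file name, the optional group 2
// is the extension. Excluding \ from both character classes forces the
// match to start after the last path separator.
static string FileNameOnly(string path)
{
    Match m = Regex.Match(path, @"([^\\]+?)(\.[^.\\]+)?$");
    return m.Success ? m.Groups[1].Value : string.Empty; // assumed fallback
}

Console.WriteLine(FileNameOnly(@"C:\PERSONAL\TEST\TESTFILE.PDF"));  // TESTFILE
Console.WriteLine(FileNameOnly(@"C:\PERSONAL\TEST\TEST.FILE.PDF")); // TEST.FILE
Console.WriteLine(FileNameOnly(@"C:\PERSONAL\TEST\README"));        // README
```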
Upvotes: 1
Reputation: 9
Try this:
.*(?=[.][^OS_FORBIDDEN_CHARACTERS]+$)
For Windows:
OS_FORBIDDEN_CHARACTERS = :\/\\\?"><\|
This is a slight modification of: Regular expression get filename without extention from full filepath
If you don't mind matching forbidden characters, then the simplest regex would be:
.*(?=[.].*$)
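To show the difference between the two, here is a rough sketch with the Windows character list substituted into the first pattern. The constant names and the second sample path are made up for illustration. Both patterns return the path up to the extension, so the directory part is still included in the match.

```csharp
using System;
using System.Text.RegularExpressions;

// First pattern with OS_FORBIDDEN_CHARACTERS replaced by the Windows list.
// Inside a C# verbatim string, a double quote is written as "".
const string WithForbidden = @".*(?=[.][^:\/\\\?""><\|]+$)";
const string Simple = @".*(?=[.].*$)";

Console.WriteLine(Regex.Match(@"C:\PERSONAL\TEST\TESTFILE.PDF", WithForbidden).Value);
// C:\PERSONAL\TEST\TESTFILE

// A file without an extension inside a directory whose name contains a dot.
string dottedDir = @"C:\PERS.ONAL\TEST\README";

// The forbidden-character class cannot cross the backslashes after ".ONAL",
// so the lookahead never reaches the end of the string and nothing matches.
Console.WriteLine(Regex.Match(dottedDir, WithForbidden).Success); // False

// The simple pattern accepts any dot, so it cuts at the directory's dot.
Console.WriteLine(Regex.Match(dottedDir, Simple).Value); // C:\PERS
```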
Upvotes: 0