Hooplator15

Reputation: 1550

C# Regex to Get file name without extension?

I want to use regex to get a filename without extension. I'm having trouble getting regex to return a value. I have this:

string path = @"C:\PERSONAL\TEST\TESTFILE.PDF";
var name = Regex.Match(path, @"(.+?)(\.[^\.]+$|$)").Value;

In this case, name always comes back as C:\PERSONAL\TEST\TESTFILE.PDF. What am I doing wrong? I think my search pattern is correct.

(I am aware that I could use Path.GetFileNameWithoutExtension(path); but I specifically want to try using regex.)

Upvotes: 1

Views: 2617

Answers (4)

Slai

Reputation: 22896

Can be a bit shorter and greedier:

var name = Regex.Replace(@"C:\PERS.ONAL\TEST\TEST.FILE.PDF", @".*\\(.*)\..*", "$1"); // "TEST.FILE"

Upvotes: 1

ΩmegaMan

Reputation: 31721

Since the data is on the right side of the string, tell the regex parser to work from the end of the string to the beginning by using the option RegexOptions.RightToLeft. This can reduce the processing time and also shortens the pattern needed.

The pattern below is still written left to right. It says: capture one or more characters that are not a \ character (so the match cannot extend past the last backslash), followed by a literal period. With RightToLeft, the engine finds the last period first and then consumes backward from it until it reaches the backslash.

Regex.Match(@"C:\PERSONAL\TEST\TESTFILE.PDF", 
            @"([^\\]+)\.", 
            RegexOptions.RightToLeft)
      .Groups[1].Value

Prints out

TESTFILE
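
To make it easier to try the pattern above on other inputs, here is a minimal sketch that wraps it in a method. The class name RightToLeftDemo and the method name NameWithoutExtension are hypothetical, chosen only for this illustration, and returning an empty string when nothing matches is an assumption.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RightToLeftDemo
{
    // Hypothetical helper: applies the pattern from the answer above.
    // Returns an empty string when nothing matches (an assumption for this sketch).
    public static string NameWithoutExtension(string path)
    {
        Match match = Regex.Match(path, @"([^\\]+)\.", RegexOptions.RightToLeft);
        return match.Success ? match.Groups[1].Value : string.Empty;
    }

    public static void Main()
    {
        // Single extension
        Console.WriteLine(NameWithoutExtension(@"C:\PERSONAL\TEST\TESTFILE.PDF"));  // TESTFILE
        // A dot inside the file name: only the last extension is dropped
        Console.WriteLine(NameWithoutExtension(@"C:\PERSONAL\TEST\TEST.FILE.PDF")); // TEST.FILE
    }
}
```

Note that the pattern anchors on the last period anywhere in the string, so a path whose file has no extension but whose directory name contains a dot would match inside the directory name instead.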

Upvotes: 1

Liu

Reputation: 982

You need Groups[1].Value:

string path = @"C:\PERSONAL\TEST\TESTFILE.PDF";
var match = Regex.Match(path, @"(.+?)(\.[^\.]+$|$)");
if(match.Success)
{
    var name = match.Groups[1].Value;
}

match.Value returns the entire matched substring.

match.Groups[0].Value always has the same value as match.Value.

match.Groups[1].Value returns the value of the first capture group.

For example:

string path = @"C:\PERSONAL\TEST\TESTFILE.PDF";
var match = Regex.Match(path, @"(.+?)(\.[^\.]+$|$)");
if(match.Success)
{
    Console.WriteLine(match.Value);
    // returns the substring matched by the whole pattern
    // Output: C:\PERSONAL\TEST\TESTFILE.PDF
    Console.WriteLine(match.Groups[0].Value);
    // always the same as match.Value
    // Output: C:\PERSONAL\TEST\TESTFILE.PDF
    Console.WriteLine(match.Groups[1].Value);
    // returns the first capture group, which is (.+?) in this case
    // Output: C:\PERSONAL\TEST\TESTFILE
    Console.WriteLine(match.Groups[2].Value);
    // returns the second capture group, which is (\.[^\.]+$|$) in this case
    // Output: .PDF
}

Upvotes: 1

Tobyash

Reputation: 9

Try this:

.*(?=[.][^OS_FORBIDDEN_CHARACTERS]+$)

For Windows:

OS_FORBIDDEN_CHARACTERS = :\/\\\?"><\|

This is a slight modification of: Regular expression get filename without extention from full filepath

If you are fine with matching forbidden characters, then the simplest regex would be:

.*(?=[.].*$)
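
As a rough illustration of how the first pattern might look in C#, the sketch below substitutes the Windows character list from above for the OS_FORBIDDEN_CHARACTERS placeholder. The class name LookaheadDemo and the method name StripExtension are hypothetical. Because the pattern starts with .* and is not anchored to the last backslash, the match begins at the start of the input, so for a full path the result still includes the directory.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaheadDemo
{
    // The placeholder is replaced by the Windows list given in the answer.
    // In a verbatim string, "" stands for a single double-quote character.
    private const string Pattern = @".*(?=[.][^:/\\?""<>|]+$)";

    // Hypothetical helper: returns the text before the final extension,
    // or an empty string when the pattern does not match.
    public static string StripExtension(string input)
    {
        return Regex.Match(input, Pattern).Value;
    }

    public static void Main()
    {
        Console.WriteLine(StripExtension("TESTFILE.PDF"));                   // TESTFILE
        Console.WriteLine(StripExtension(@"C:\PERSONAL\TEST\TESTFILE.PDF")); // C:\PERSONAL\TEST\TESTFILE
    }
}
```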

Upvotes: 0
