sumesh ghimire

Extract file name using Regex

I want to match only the file name. For example, given C://Directory/FileName.cs, the regex should ignore everything before FileName.cs.

How can I do it?

I need this for a compiled UI I am working on. I can't use a programming language, as it only accepts regex.

Any ideas?

Upvotes: 12

Views: 46422

Answers (11)

Oybek Odilov

Reputation: 331

Try this (works with both / and \):

[^\/\\]*$

Upvotes: 1

Christian

Reputation: 257

The regular expression that did it for me was:

[^\/]*$

Upvotes: 4

Ste

Reputation: 2293

Some people interpret "filename" as the basename, i.e. the name without its extension. This example extracts the basename, including for files that have no extension for some reason. It can also get the last directory of a path in the same fashion.

You can see how it works and test it here: https://regexr.com/4ht5v

The regexp is:

.+?\\(?=\w+)|\.\w+$|\\$


Before:

C:\Directory\BaseFileName.ext

C:\Directory\BaseFileName

C:\This is a Directory\Last Directory With trailing backslash\

C:\This is a Directory\Last Directory Without trailing backslash

After:

BaseFileName

BaseFileName

Last Directory With trailing backslash

Last Directory Without trailing backslash

For the sake of completeness, this is how it would work in JavaScript, should anyone require it.

// Example of getting a BaseFileName from a path

var path = "C:\\Directory\\FileName.cs";
var result = path.replace(/.+?\\(?=\w+)|\.\w+$|\\$/gm,"");
console.log(result);

Upvotes: 1

InterruptedException

Reputation: 407

A rather elegant solution with lookahead and lookbehind wasn't mentioned:

(?<=[\\/])[^\\/]+(?=\.cs$)
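
As a rough sketch of how lookbehind and lookahead can isolate the name without its extension: the lookbehind requires a path separator just before the match, and the lookahead requires ".cs" at the end of the string. The variable names are made up for illustration, and this assumes an engine with lookbehind support (modern JavaScript has it).

```javascript
// Lookbehind: a "/" or "\" must come immediately before the match.
// Lookahead: the match must be followed by ".cs" at the end of the string.
const baseNamePattern = /(?<=[\\\/])[^\\\/]+(?=\.cs$)/;

const found = "C://Directory/FileName.cs".match(baseNamePattern);
console.log(found ? found[0] : null); // FileName
```

Neither lookaround consumes characters, so the match itself contains only the name.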

Upvotes: 0

spugm1r3

Reputation: 3649

I'm way late to the party and I'm also ignoring the requirement of regex because, as J-16 SDiZ pointed out, sometimes there is a better solution. Even though the question is 4 years old, people looking for a simple solution deserve choices.

Try using the following:

public string ConvertFileName(string filename)
{
    // Split on the backslash separator and keep the last segment.
    string[] temparray = filename.Split('\\');
    filename = temparray[temparray.Length - 1];
    return filename;
}

This method splits the string on the "\" character, stores the resulting strings in an array and returns the last element of the array (the filename).

Though the OP seems to be writing for UNIX, it doesn't take much to figure out how to tailor it to your particular need.
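
For comparison, here is a minimal JavaScript sketch of the same split-and-take-last idea, tailored to accept either separator. The helper name lastSegment is hypothetical, not part of any library.

```javascript
// Hypothetical helper: split on "/" or "\" and return the last segment.
function lastSegment(path) {
  const parts = path.split(/[\\\/]/);
  return parts[parts.length - 1];
}

console.log(lastSegment("C:\\Directory\\FileName.cs")); // FileName.cs
console.log(lastSegment("C://Directory/FileName.cs"));  // FileName.cs
```

A path that ends in a separator yields an empty string, because the last segment after the final separator is empty.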

Upvotes: 1

Nischal

Reputation: 1

Suppose the file name contains special characters. This matters especially when supporting Mac clients, where special characters are allowed in file names: on the server side, Path.GetFileName(fileName) fails and throws an error because of illegal characters in the path. The following code, using a regex, comes to the rescue.

The regex takes care of two cases:

  1. In IE, when a file is uploaded, the file path contains the folders as well (e.g. c:\samplefolder\subfolder\sample.xls). The expression below replaces all folders with an empty string and retains the file name.

  2. On a Mac, the file name is the only thing supplied, since the browser is Safari, which allows special characters in the file name.

    var regExpDir = @"(^[\w]:\\)([\w].+\w\\)";
    
    // fileName is assumed to already hold the uploaded path.
    fileName = Regex.Replace(fileName, regExpDir, string.Empty);
    

Upvotes: 0

Victor

Reputation: 5807

Just a variation on miky's that works for both filesystem path separators: [^\\/]*$

Upvotes: 0

Peter Boughton

Reputation: 112160

Based on your comment of needing to exclude paths that do not match 'abc', try this:

^.+/(?:(?!abc)[^/])+$


Completely split out in regex comment mode, that is:

(?x)     # flag to enable comments
^        # start of line

.+       # match any character (except newline)
         #   greedily one or more times
/        # a literal slash character

(?:      # begin non-capturing group
  (?!      # begin negative lookahead
           # (contents must not appear after the current position)
    abc      # literal text abc
  )        # end negative lookahead
  [^/]     # any character that is not a slash
)        # end non-capturing group
+        # repeat the above nc group one or more times
         #   (essentially, we keep looking for non-slashes that are not 'abc')

$        # end of line
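
To see the pattern behave on sample input, here is a small JavaScript sketch. JavaScript has no (?x) comment mode, so the compact form is used; the sample paths and the name noAbc are made up for illustration.

```javascript
// Compact form of the pattern: the final path segment must not contain "abc".
const noAbc = /^.+\/(?:(?!abc)[^\/])+$/;

console.log(noAbc.test("C:/Directory/FileName.cs")); // true
console.log(noAbc.test("C:/Directory/abcFile.cs"));  // false
console.log(noAbc.test("C:/abc/FileName.cs"));       // true
```

The last case passes because the lookahead is only checked for characters after the final slash, so "abc" in a directory name does not count.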

Upvotes: 5

jpbaudoin

Reputation: 81

I would use: .*/(.*$)

The parentheses mark a group, which is the file name. The regular expression you use may vary depending on the regex syntax (PCRE, POSIX).
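
A minimal JavaScript sketch of the capturing-group approach, assuming the intended pattern is .*/(.*$) with a greedy prefix up to the last slash. The variable names are illustrative only.

```javascript
// The greedy ".*" consumes everything up to the last "/";
// group 1 then captures whatever remains, i.e. the file name.
const groupPattern = /.*\/(.*$)/;

const groups = "C://Directory/FileName.cs".match(groupPattern);
console.log(groups ? groups[1] : null); // FileName.cs
```

Here the full match (index 0) is the whole path, so only the captured group (index 1) is useful. Whether a capturing group can be read back depends on the tool consuming the regex.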

I suggest you use a regex tool; there are several for Windows and Linux:

Windows - http://sourceforge.net/projects/regexcreator/

Windows - http://weitz.de/regex-coach/

Linux - kodos

Hope it helps

Upvotes: 0

Mike Dinescu

Reputation: 55720

Something like this might work:

[^/]*$

It matches all characters to the end of the line that are not "/".

If you want to match paths that use the "\" path separator you would change the regex to:

[^\]*$

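But do make sure to escape the "\" character if your programming language or environment requires it. Most regex engines treat "\" as an escape character themselves, so an unescaped "\]" would not close the character class. For instance, you might have to write something like this: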

[^\\]*$

EDIT: I removed the leading "/" and trailing "/", as they may be confusing: they are not really part of the regEx, just a very common way of delimiting a regular expression.

And of course, depending on the features that the regEx engine supports you may be able to use look-ahead/look-behind and capturing to craft a better regEx.
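
As a quick check of the idea, here is a JavaScript sketch that combines both separators in one character class. The helper name fileNameOf is hypothetical.

```javascript
// Hypothetical helper: match the trailing run of characters that are
// neither "/" nor "\", anchored at the end of the string.
function fileNameOf(path) {
  const match = path.match(/[^\\\/]*$/);
  return match ? match[0] : "";
}

console.log(fileNameOf("C://Directory/FileName.cs"));  // FileName.cs
console.log(fileNameOf("C:\\Directory\\FileName.cs")); // FileName.cs
```

Because the pattern uses *, it also matches the empty string, so a path ending in a separator returns "" rather than failing to match.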

Upvotes: 20

Tom Leys

Reputation: 19029

What language are you using? Why are you not using the standard path mechanisms of that language?

How about http://msdn.microsoft.com/en-us/library/system.io.path.aspx ?
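
The linked page covers .NET. As an analogous sketch in another environment, Node.js ships a path module whose basename function does the same job. This assumes a Node.js runtime, which the question does not mention.

```javascript
// Node.js standard library; no third-party packages needed.
const path = require("path");

// path.win32 treats both "\" and "/" as separators.
console.log(path.win32.basename("C:\\Directory\\FileName.cs")); // FileName.cs
console.log(path.win32.basename("C://Directory/FileName.cs"));  // FileName.cs

// path.posix treats only "/" as a separator.
console.log(path.posix.basename("/home/user/FileName.cs"));     // FileName.cs
```

Using the explicit win32 or posix variant keeps the result independent of the platform the script runs on.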

Upvotes: 7
