javier_el_bene

Reputation: 450

Using Directory.GetFiles() to select all the files but a certain extension

I found several questions on Stack Overflow about Directory.GetFiles(), but all of them explain how to use it to find a specific extension or a set of files matching multiple criteria. In my case, what I want is a search pattern for Directory.GetFiles(), using regular expressions, that returns all of the files in the directory except the set I specify. That is, I don't want to declare the set I want, but its complement. For example, I want all of the files in a directory except the HTML files. Note that I know it can be achieved this way:

var filteredFiles = Directory
    .GetFiles(path, "*.*")
    .Where(file => !file.ToLower().EndsWith("html"))
    .ToList();

But this is not a very reusable solution: if I later want to filter out another kind of file, I have to change the code by adding an || to the Where condition. I'm looking for something that lets me create a regex describing the files I don't want and pass it to Directory.GetFiles(). Then, if I want to filter out more extensions later, I only have to change the regex.

Upvotes: 0

Views: 1622

Answers (3)

Tim Schmelter

Reputation: 460108

You don't need a regex if you want to filter extension(s):

// for example a field or property in your class
private HashSet<string> ExtensionBlacklist { get; } =
    new HashSet<string>(StringComparer.InvariantCultureIgnoreCase)
    {
        ".html",
        ".htm"
    };
// ...

var filteredFiles = Directory.EnumerateFiles(path, "*.*")
    .Where(fn => !ExtensionBlacklist.Contains(System.IO.Path.GetExtension(fn)))
    .ToList();

Upvotes: 7

rory.ap

Reputation: 35270

I would recommend against using regex in favor of something like this:

var filteredFiles = Directory
    .GetFiles(path, "*.*")
    .Where(file => !excludedExtensions.Any(extension =>
        file.EndsWith(extension, StringComparison.CurrentCultureIgnoreCase)))
    .ToList();

You can pass it a collection of your excluded extensions, e.g.:

var excludedExtensions = new List<string> { ".html", ".xml" };

The Any will short-circuit as soon as it finds a match on an excluded extension, so I think this is preferable even to excludedExtensions.Contains(). As for the regex, I don't think there's a good reason to use that given the trouble it can buy you. Don't use regex unless it's the only tool for the job.
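To make the reuse concern from the question concrete, the same filter could be wrapped in a small helper that takes the excluded extensions as a parameter. This is only a sketch: the names ExtensionFilter and ExcludeExtensions are hypothetical, and it uses an ordinal comparison instead of the culture-sensitive one above, which is an assumption on my part.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ExtensionFilter
{
    // Hypothetical helper: returns the paths that do not end with any of the
    // excluded extensions (case-insensitive, ordinal comparison).
    public static List<string> ExcludeExtensions(
        IEnumerable<string> paths, params string[] excludedExtensions)
    {
        return paths
            .Where(p => !excludedExtensions.Any(ext =>
                p.EndsWith(ext, StringComparison.OrdinalIgnoreCase)))
            .ToList();
    }
}
```

It could then be called as ExtensionFilter.ExcludeExtensions(Directory.GetFiles(path), ".html", ".xml"), and excluding another file type later only means adding an argument.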

Upvotes: 1

Joey

Reputation: 354516

So essentially you just don't know how to perform a regex match on a string?

There is Regex.IsMatch for that very purpose. However, you could also change the code to look up the extension in a set of extensions to filter, which would also allow you to easily add new filters.
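For illustration, a minimal sketch of the regex variant. The search pattern of Directory.GetFiles() only understands the * and ? wildcards, so the regex has to be applied to the returned paths afterwards. The names FileFilter and ExcludeMatching and the example pattern are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class FileFilter
{
    // Hypothetical helper: keeps only the paths that do NOT match excludePattern.
    public static List<string> ExcludeMatching(
        IEnumerable<string> paths, string excludePattern)
    {
        // Build the regex once and reuse it for every path.
        var exclude = new Regex(excludePattern, RegexOptions.IgnoreCase);
        return paths.Where(p => !exclude.IsMatch(p)).ToList();
    }
}
```

A call such as FileFilter.ExcludeMatching(Directory.EnumerateFiles(path), @"\.(html?|xml)$") would drop .htm, .html and .xml files; filtering more extensions later only requires changing the pattern.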

Upvotes: 0
