Reputation: 4104
I'm building an application that will copy files from one server to another. I have these locations mapped as network drives, so DeployFrom is something like Z:\MyPath\Subpath.
The actual code to get all the files:
List<string> files = Directory.GetFiles(DeployFrom, "*.aspx", SearchOption.AllDirectories).ToList();
files.AddRange(Directory.GetFiles(DeployFrom, "*.ascx", SearchOption.AllDirectories));
files.AddRange(Directory.GetFiles(DeployFrom, "*.css", SearchOption.AllDirectories));
files.AddRange(Directory.GetFiles(DeployFrom, "*.htm", SearchOption.AllDirectories));
files.AddRange(Directory.GetFiles(DeployFrom, "*.html", SearchOption.AllDirectories));
files.AddRange(Directory.GetFiles(DeployFrom, "*.js", SearchOption.AllDirectories));
But it runs extremely slowly, as there are around 2 GB / 13.6K files in DeployFrom.
I found this suggestion in a similar SO post about using GetFiles for multiple types:
Directory.GetFiles(DeployFrom, "*.*", SearchOption.AllDirectories)
.Where(s => s.EndsWith(".aspx") || s.EndsWith(".css") || s.EndsWith(".htm") || s.EndsWith(".html") || s.EndsWith(".js")).ToList();
But this still takes about 2 minutes to build a list of 1800 files.
Is there a faster way, without hardcoding a list of folders to specifically check?
The only other option I can think of is to use Directory.GetDirectories() and filter out a blacklist of folders I know I don't care about, then iterate over that collection and call the second snippet above with SearchOption.TopDirectoryOnly instead (rough sketch below). I don't want to hardcode "good" folders to check, because if new "good" folders get added, we'll have to come back and add them to this utility too. Even then, this won't cut down on the number of files checked very much; most of the "bad" folders just contain large files, which I don't think affects the runtime.
Upvotes: 0
Views: 61
Reputation: 407
I would recommend trying Directory.EnumerateFiles() instead of Directory.GetFiles(). Based on my test below, I was able to start enumerating the files substantially faster with Directory.EnumerateFiles().
Reference: https://msdn.microsoft.com/en-us/library/ff462679(v=vs.110).aspx
Another possibility would be to run the search asynchronously for each sub-directory; see the sketch at the end of this answer.
using System;
using System.Diagnostics;
using System.IO;

class Program
{
    static void Main(string[] args)
    {
        Stopwatch sw = new Stopwatch();

        // Directory.GetFiles() builds the complete string[] before it returns.
        sw.Start();
        var allFiles = Directory.GetFiles(@"C:\Users\ertdiddy\Documents\Visual Studio 2013\Projects", "*.*", SearchOption.AllDirectories);
        sw.Stop();
        var getFileTime = sw.Elapsed.TotalSeconds;
        Console.WriteLine(getFileTime);
        sw.Reset();

        // Directory.EnumerateFiles() returns a lazy IEnumerable<string>;
        // files are produced as the collection is iterated rather than all up front.
        sw.Start();
        var enumeratedFiles = Directory.EnumerateFiles(@"C:\Users\ertdiddy\Documents\Visual Studio 2013\Projects", "*.*", SearchOption.AllDirectories);
        sw.Stop();
        var enumerateFileTime = sw.Elapsed.TotalSeconds;
        Console.WriteLine(enumerateFileTime);

        // Output:
        // GetFiles()       = 0.499075 seconds
        // EnumerateFiles() = 0.0001175 seconds
    }
}
Use LINQ, as you suggested, to filter your files:
Directory.EnumerateFiles(@"C:\Users\ertdiddy\Documents\Visual Studio 2013\Projects", "*.*", SearchOption.AllDirectories).Where(fileType => fileType.EndsWith(".cs") || fileType.EndsWith(".dll"));
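In your case that would be something along these lines (just a sketch -- the extension list is copied from your question, and the HashSet/Path.GetExtension combination makes the check case-insensitive):

var extensions = new HashSet<string>(StringComparer.OrdinalIgnoreCase)
    { ".aspx", ".ascx", ".css", ".htm", ".html", ".js" };

// EnumerateFiles streams results, so filtering starts before the full directory walk finishes.
List<string> files = Directory.EnumerateFiles(DeployFrom, "*.*", SearchOption.AllDirectories)
    .Where(f => extensions.Contains(Path.GetExtension(f)))
    .ToList();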
Side note: my test directory contained about 17.5K files (2.65 GB), comparable to yours.
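For the asynchronous/parallel idea mentioned above, here is a minimal sketch (I haven't benchmarked it; FindDeployFiles is just a name I picked, and the extension set is taken from the question):

using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

static List<string> FindDeployFiles(string root)
{
    var extensions = new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        { ".aspx", ".ascx", ".css", ".htm", ".html", ".js" };

    var results = new ConcurrentBag<string>();

    // Search each top-level sub-directory on its own worker.
    Parallel.ForEach(Directory.EnumerateDirectories(root), dir =>
    {
        foreach (var file in Directory.EnumerateFiles(dir, "*.*", SearchOption.AllDirectories))
        {
            if (extensions.Contains(Path.GetExtension(file)))
                results.Add(file);
        }
    });

    // Pick up files sitting directly in the root folder.
    foreach (var file in Directory.EnumerateFiles(root, "*.*", SearchOption.TopDirectoryOnly))
    {
        if (extensions.Contains(Path.GetExtension(file)))
            results.Add(file);
    }

    return results.ToList();
}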
Upvotes: 2