Reputation: 4775
I am currently creating a file scanner that enumerates files based on certain criteria. One of the scan options is to exclude files that are larger than a pre-defined size. This rule can be applied to individual directories and their children.
For example, a user can specify a rule to take only files that are no larger than 1GB from C:\Users\{USERNAME}\Documents. So if the user decides to scan a directory inside the Documents folder, let's say C:\Users\{USERNAME}\Documents\SOMEDIR1\SOMEDIR2\,
the specified rule should apply to that directory, and only files that are less than or equal to 1GB in size should be populated.
Currently I am storing the rules in a Dictionary defined as Dictionary<string, long> dSizeLimit;
where the key is the full directory path and the value is the rule's file size in bytes.
Currently I am using the following method to determine if a file should be omitted from the populated files list:
public void SearchDirectory(DirectoryInfo dir_info, List<string> file_list, ref long size, ScanOptions Opt = null)
{
    if (Opt == null)
        Opt = DefaultOption;
    try
    {
        foreach (DirectoryInfo subdir_info in dir_info.GetDirectories())
        {
            SearchDirectory(subdir_info, file_list, ref size, Opt);
        }
    }
    catch (Exception ex)
    {
        Console.WriteLine("Failed to enumerate directory: {0}", dir_info.FullName);
        Console.WriteLine("Exception: {0}", ex.Message);
    }
    try
    {
        foreach (FileInfo file_info in dir_info.GetFiles())
        {
            // Here I iterate over all the size rules to determine if the current file should be added to the file_list
            bool tooLarge = false;
            foreach (KeyValuePair<string, long> entry in Opt.dSizeLimit)
            {
                if (string.Compare(entry.Key, 0, file_info.FullName, 0, entry.Key.Length, true) == 0
                    && entry.Value > 0 && file_info.Length > entry.Value)
                {
                    // A plain "continue" here would only advance this inner loop,
                    // so flag the file and skip it in the outer loop instead
                    tooLarge = true;
                    break;
                }
            }
            if (tooLarge)
                continue;
            file_list.Add(file_info.FullName);
            size += file_info.Length;
        }
    }
    catch (Exception ex)
    {
        Console.WriteLine("Failed to enumerate directory: {0}", dir_info.FullName);
        Console.WriteLine("Exception: {0}", ex.Message);
    }
}
ScanOptions is a struct that contains all the scan rules including the size rule.
As you can see from the code, I currently iterate over all the rules for every file to determine if it should be included in the file list. This can become very slow, since the number of entries in the dSizeLimit dictionary is not limited: the user can add whatever rules they want.
So is there a better way to handle such a lookup?
P.S. Please note that my target framework should be .NET 2.0, so LINQ and any other non 2.0 friendly namespaces are out of the question.
Upvotes: 0
Views: 119
Reputation: 57182
If rules are applied on a directory basis, then you could determine the most restrictive rule before iterating on the files, something like this:
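long maxSize = long.MaxValue;
foreach (KeyValuePair<string, long> entry in Opt.dSizeLimit) {
    // Ignore non-positive values (treated as "no limit" in your code) and
    // compare case-insensitively, as your string.Compare call does
    if (entry.Value > 0 && dir_info.FullName.StartsWith(entry.Key, StringComparison.OrdinalIgnoreCase)) {
        maxSize = Math.Min(maxSize, entry.Value);
    }
}
// now iterate on the files and skip those with Length > maxSize; if no rule
// matched, maxSize is still long.MaxValue and every file passes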
There is no reason (if I understood correctly) to re-scan the rules each time for files that are in the same folder, so this should save quite a lot of operations.
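To make the per-directory idea concrete, here is a minimal sketch that avoids LINQ and other post-2.0 features. The class and method names (SizeRules, GetEffectiveLimit, IsSameOrParent) are made up for illustration and are not part of your code. It assumes Windows-style paths and that a value of 0 or less means "no limit". The separator check is an extra precaution so that a rule for C:\Docs does not also match C:\Docs2.

```csharp
using System;
using System.Collections.Generic;

static class SizeRules
{
    // Assumption: Windows-style paths, as in the question.
    private const char Sep = '\\';

    // Returns the smallest positive limit among the rules whose directory is
    // dirPath itself or one of its ancestors; long.MaxValue if none apply.
    public static long GetEffectiveLimit(string dirPath, Dictionary<string, long> rules)
    {
        long limit = long.MaxValue;
        foreach (KeyValuePair<string, long> entry in rules)
        {
            if (entry.Value > 0 && IsSameOrParent(entry.Key, dirPath))
                limit = Math.Min(limit, entry.Value);
        }
        return limit;
    }

    // True when 'parent' equals 'path' or is one of its ancestors.
    // Both sides get exactly one trailing separator before the prefix test,
    // so "C:\Docs" matches "C:\Docs\Sub" but not "C:\Docs2".
    public static bool IsSameOrParent(string parent, string path)
    {
        string p = parent.TrimEnd(Sep) + Sep;
        string c = path.TrimEnd(Sep) + Sep;
        return c.StartsWith(p, StringComparison.OrdinalIgnoreCase);
    }
}

// Possible use inside SearchDirectory, once per directory rather than per file:
//
//     long limit = SizeRules.GetEffectiveLimit(dir_info.FullName, Opt.dSizeLimit);
//     foreach (FileInfo file_info in dir_info.GetFiles())
//     {
//         if (file_info.Length > limit)
//             continue;
//         file_list.Add(file_info.FullName);
//         size += file_info.Length;
//     }
```

With this shape the rules are scanned once per directory, so the cost is roughly (directories x rules) instead of (files x rules).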
To avoid iterating over the dictionary at all, the options struct could carry just one size value; as you recurse into folders, you build the struct with the value that applies to each one, something like this (pseudocode, just to give you the idea):
foreach (DirectoryInfo subdir_info in dir_info.GetDirectories()) {
    ScanOptions optForSubFolder = Opt;
    if (/* more specific rules for given subfolder */) {
        optForSubFolder.SizeLimit = /* value for subfolder */;
    }
    SearchDirectory(subdir_info, file_list, ref size, optForSubFolder);
}
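One way the pseudocode above might be fleshed out is sketched below, again within .NET 2.0 limits. The names InheritedLimitScanner, ResolveLimit, Scan and NoLimit are hypothetical. The sketch assumes a rule on a subfolder overrides the limit inherited from its parent, that rule keys are stored without a trailing separator (as DirectoryInfo.FullName normally reports them), and that the dictionary was created with StringComparer.OrdinalIgnoreCase so the exact lookup ignores case. Error handling is left out for brevity.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class InheritedLimitScanner
{
    public const long NoLimit = long.MaxValue;

    // One dictionary lookup per directory: a positive rule registered for
    // exactly this path overrides the inherited limit; otherwise inherit.
    public static long ResolveLimit(string dirPath, long inherited, Dictionary<string, long> rules)
    {
        long own;
        if (rules.TryGetValue(dirPath, out own) && own > 0)
            return own;
        return inherited;
    }

    // Recursively collects files no larger than the limit in effect for
    // their directory, passing the resolved limit down to the children.
    public static void Scan(DirectoryInfo dir, long inherited, Dictionary<string, long> rules, List<string> files)
    {
        long limit = ResolveLimit(dir.FullName, inherited, rules);
        foreach (FileInfo file in dir.GetFiles())
        {
            if (file.Length <= limit)
                files.Add(file.FullName);
        }
        foreach (DirectoryInfo sub in dir.GetDirectories())
            Scan(sub, limit, rules, files);
    }
}
```

Note that if the scan starts somewhere below a directory that has a rule, the first call would still need its inherited value seeded from the ancestors (one pass over the rules for the start directory); after that, each folder costs a single hash lookup.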
Upvotes: 1