Reputation: 97
So this is probably horribly inefficient, but I am trying to find a way to build a list of files in a directory (there are tens of thousands). I extract information from each file, then I build a cache file so that I only check NEW files for this information.
What I am doing right now: in Properties.Settings.Default.FileCache I have a StringCollection, and my application runs through it like this:
Parsing Process:
1- Iterate through all folders and subfolders to build the file list
2- Reload cache file and compare (Explained later since it probably makes more sense to explain how I am building it in the first place before I explain how I am comparing)
3- Parse the information I want from new files
4- Properties.Settings.Default.FileCache.Add(FileName + "|" + Information1 + "|" + Information2)
Reloading Cache and comparing:
1- Split the three values of each cache entry into a List
2- If the file exists in the cache list, I remove it from the new list
3- For any remaining files I go to step 3 of the parsing process above.
This seems horribly inefficient. But I am new to C# and it is the only method I have come up with so far.
Upvotes: 1
Views: 1098
Reputation: 134005
It seems like you can save yourself a little trouble by loading the cache first and creating a HashSet<string> containing all of the file names that already exist in the cache.
Then iterate through the folders. For each file, first see if it's in the cache. If it's not in the cache, then parse the information you want and add that name to the cache.
That way the amount of information you're holding in memory is smaller (i.e. you don't have to keep all of the file names around), and you look at a file one time. If it's already in the cache, then ignore it. If it's not in the cache, extract the information you want and add to the cache. Then move on.
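The approach above could be sketched roughly like this. It is only a sketch under some assumptions: the class and method names (FileCacheScanner, LoadCachedNames, ScanForNewFiles) are made up for illustration, cache entries are assumed to have the "FileName|Information1|Information2" shape from the question, the parse delegate stands in for whatever extraction you already do, and file names are compared case-insensitively on the assumption that this runs on Windows.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper class, not part of any library.
public static class FileCacheScanner
{
    // Builds the set of file names already present in the cache.
    // Each entry is assumed to look like "FileName|Information1|Information2".
    public static HashSet<string> LoadCachedNames(IEnumerable<string> cacheEntries)
    {
        // Assumption: paths are case-insensitive (Windows).
        var names = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        foreach (string entry in cacheEntries)
        {
            int sep = entry.IndexOf('|');
            names.Add(sep >= 0 ? entry.Substring(0, sep) : entry);
        }
        return names;
    }

    // Walks the directory tree once. Files already in the set are skipped;
    // new files are parsed and returned as cache entries ready to be stored.
    // parse is expected to return "Information1|Information2" for a path.
    public static List<string> ScanForNewFiles(
        string root, HashSet<string> cachedNames, Func<string, string> parse)
    {
        var newEntries = new List<string>();
        foreach (string path in Directory.EnumerateFiles(root, "*", SearchOption.AllDirectories))
        {
            // HashSet.Add returns false when the name is already present.
            if (cachedNames.Add(path))
            {
                newEntries.Add(path + "|" + parse(path));
            }
        }
        return newEntries;
    }
}
```

With the StringCollection from the question, something like Properties.Settings.Default.FileCache.Cast<string>() (with System.Linq imported) could be passed as cacheEntries, and each returned entry added back to FileCache before calling Save(). Directory.EnumerateFiles streams results instead of building the whole list up front, but it can throw UnauthorizedAccessException on folders you cannot read, so real code would likely need to handle that.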
Unless you can be notified somehow of new files (for example, your program is always running and has a FileSystemWatcher monitoring the directory), that's the best you can do.
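For the always-running case, a minimal sketch of the FileSystemWatcher idea might look like the following. The NewFileWatcher class and the onNewFile callback are hypothetical names for illustration; the callback is where you would parse the file and add it to the cache.

```csharp
using System;
using System.IO;

// Hypothetical wrapper, shown only to illustrate the idea.
public static class NewFileWatcher
{
    // Starts watching root (including subfolders) and calls onNewFile
    // with the full path of each newly created file.
    // The caller should keep the returned watcher alive and dispose it when done.
    public static FileSystemWatcher Watch(string root, Action<string> onNewFile)
    {
        var watcher = new FileSystemWatcher(root)
        {
            IncludeSubdirectories = true,
            NotifyFilter = NotifyFilters.FileName
        };
        // Created fires as soon as the file appears, possibly before
        // the writer has finished filling it.
        watcher.Created += (sender, e) => onNewFile(e.FullPath);
        watcher.EnableRaisingEvents = true;
        return watcher;
    }
}
```

The watcher only reports files created while the program is running, so an initial scan against the cache is still needed at startup. Events can also be dropped if many files arrive at once and the watcher's internal buffer overflows, and files that are renamed into the folder raise Renamed rather than Created, so an occasional full rescan is a reasonable safety net.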
Upvotes: 1