Reputation: 1443
We are working with an external accounts program that saves "printed documents" on a network share. Each "printed document" in the directory consists of 3 files.
We have a customer who has 120,000 files in the directory.
Currently, when the user wants to see all the "printed documents", the software loops over every file in the directory, reads each XML file, and checks whether the report is for this user. This takes 10 minutes each time.
We are trying to create a faster solution.
The only idea I can think of is to loop over the files and store the contents (file name, XML details) in a database table, recording a "Last Scanned Date". On the next pass I can skip any files older than the "Last Scanned Date", or use a LINQ query (borrowed from another post):
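// Settings values are stored as object, so convert explicitly.
DateTime lastCreatedDate = Convert.ToDateTime(Properties.Settings.Default["LastDateTime"]);
var filePaths = Directory.GetFiles(@"\\Printed\Reports\", "*_*.xml")
    .Select(p => new { Path = p, Date = System.IO.File.GetLastWriteTime(p) })
    // Filter before sorting so only the new files are ordered.
    .Where(x => x.Date >= lastCreatedDate)
    .OrderBy(x => x.Date);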
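To make the database idea a bit more concrete, here is a rough sketch of the incremental scan I have in mind. The `Dictionary` is only a stand-in for the database table, and the `ReportEntry` and `ReportIndex` names and the `<User>` element are assumptions for illustration; the real XML layout of the accounts program will differ.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml.Linq;

// Hypothetical row for one indexed report; the field names are assumptions.
public sealed class ReportEntry
{
    public string Path { get; set; }
    public string User { get; set; }
    public DateTime LastWrite { get; set; }
}

public static class ReportIndex
{
    // Stand-in for the database table, keyed by file path.
    public static readonly Dictionary<string, ReportEntry> Table =
        new Dictionary<string, ReportEntry>();

    // Parses only files written at or after lastScanned and returns the
    // newest write time seen, to be saved as the next "Last Scanned Date".
    public static DateTime Refresh(string directory, DateTime lastScanned)
    {
        DateTime newest = lastScanned;
        // DirectoryInfo hands back FileInfo objects, so the write time does
        // not need a separate File.GetLastWriteTime call per path.
        foreach (FileInfo file in new DirectoryInfo(directory).EnumerateFiles("*_*.xml"))
        {
            if (file.LastWriteTime < lastScanned) continue; // indexed on an earlier pass

            // Assumption: the owning user is stored in a <User> element.
            XElement userElement = XDocument.Load(file.FullName)
                .Descendants("User").FirstOrDefault();
            Table[file.FullName] = new ReportEntry
            {
                Path = file.FullName,
                User = userElement != null ? userElement.Value : "",
                LastWrite = file.LastWriteTime
            };
            if (file.LastWriteTime > newest) newest = file.LastWriteTime;
        }
        return newest;
    }

    // Per-user lookup that no longer touches the file share.
    public static IEnumerable<ReportEntry> ForUser(string user)
    {
        return Table.Values.Where(e => e.User == user);
    }
}
```

The first call would still be slow because it parses everything, but later calls should only parse files written since the stored date.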
Is there a quicker solution?
Upvotes: 0
Views: 161
Reputation: 11028
Perhaps it's the parsing of the XML that is taking a long time? You could do a basic "grep" of all the files for the user name/id and then do the actual XML parsing on just the matching files.
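A minimal sketch of that two-stage filter, assuming the user id appears verbatim in the file text and lives in a `<User>` element (both are guesses about your XML; `ReportGrep` and `FindReportsFor` are made-up names):

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml.Linq;

public static class ReportGrep
{
    // Returns the paths of reports that belong to userId.
    public static IEnumerable<string> FindReportsFor(string directory, string userId)
    {
        foreach (string path in Directory.EnumerateFiles(directory, "*_*.xml"))
        {
            string text = File.ReadAllText(path);

            // Stage 1: cheap substring check, no XML parsing.
            if (text.IndexOf(userId, StringComparison.Ordinal) < 0) continue;

            // Stage 2: parse only the candidates to rule out accidental matches.
            // Assumption: the owner is stored in a <User> element.
            bool isOwner = XDocument.Parse(text)
                .Descendants("User")
                .Any(e => e.Value == userId);
            if (isOwner) yield return path;
        }
    }
}
```

This only helps if parsing is the bottleneck. If most of the 10 minutes is spent reading files over the network share, every file is still read once and the gain will be small.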
Upvotes: 0
Reputation: 205
Based on your use case, it looked like what you were asking for was to have a system where a user could ask for all of their printed documents. I didn't see where date was a part of the solution.
I can think of multiple quick solutions:
NOTE - For ideas 1 and 2, you could process the new files as either part of a service, a task, or whenever a user makes a request for documents.
Upvotes: 0
Reputation: 5550
You could set up a Windows service which detects additions to the folder and then updates the database with the new entries. Thereafter, any queries on documents printed would be at the cost of a database query only.
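As a rough sketch of the detection part, `FileSystemWatcher` can raise an event for each new report. The `ReportWatcher` class and the `onNewReport` callback are hypothetical; the callback is where you would parse the XML and insert a row into the database.

```csharp
using System;
using System.IO;

public static class ReportWatcher
{
    // Starts watching the directory and invokes onNewReport with the full
    // path of each newly created report file. Keep the returned watcher
    // alive (and dispose it on shutdown), otherwise events stop arriving.
    public static FileSystemWatcher Start(string directory, Action<string> onNewReport)
    {
        var watcher = new FileSystemWatcher(directory, "*_*.xml")
        {
            NotifyFilter = NotifyFilters.FileName | NotifyFilters.LastWrite,
            IncludeSubdirectories = false
        };

        watcher.Created += (sender, e) => onNewReport(e.FullPath);

        // The internal buffer can overflow under heavy load; log it so a
        // full rescan can be triggered.
        watcher.Error += (sender, e) =>
            Console.Error.WriteLine("Watcher error: " + e.GetException().Message);

        watcher.EnableRaisingEvents = true;
        return watcher;
    }
}
```

The `Created` event can fire before the writing program has finished the file, so the callback may need to retry opening it. Events can also be missed, and I would not rely on them alone for a network share, so a periodic full rescan is a sensible safety net.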
Upvotes: 1