Reputation: 1937
Design: My C# WCF process has to cache a huge amount of data in memory (in a Dictionary); the memory used by the process grows past 1.5 GB. The cached data comes more or less straight from the database (via Entity Framework). The cache is built like this: a select query fetches the list of primary keys from a table (a list of strings), say 1000 items. I then run Parallel.ForEach over this list of primary keys; the loop body goes to the DB and fetches all data for the key (i.e. select * from table where KeyColumn = loop item), applies some operations to the data, and adds the result to the cache (Dictionary).
Problem: When the process starts, it uses almost 95% CPU (which is good) and its RAM climbs to 1.3 or 1.4 GB. It runs fine for the first 10-12 minutes. Then, for no known reason, CPU drops to 15-17% while RAM stays steady at 1.4 GB (with some more still to go), and I can see that several items from the DB have yet to be added to the cache. This frozen state continues for a painfully long time (at times 10 hours), after which everything is processed and all the data is in my cache, with RAM steady at about 1.5 GB. I thought a GC cycle might have frozen the application threads, but since it is a WCF service I can see that service method calls still respond. Only the parallel part seems frozen, and it happens on every restart at the same RAM size, with the same set of items missing from the cache every time. I have verified that there is nothing different in the data for those notoriously missing keys.
Any pointers on what might be wrong?
In simple terms, my code flows like this:
    ConcurrentDictionary<string, string> MyCache = new ConcurrentDictionary<string, string>();

    private List<string> GetPrimaryKeysFromDB()
    {
        using (var ctx = new MyDBContext())
        {
            List<string> results = ctx.MyTable.Select(x => x.PrimeColumn).ToList();
            return results;
        }
    }

    private void SomeMethod()
    {
        List<string> ListOfPrimeItems = GetPrimaryKeysFromDB();
        // #MaxDopSetting# is a placeholder for a ParallelOptions instance
        Parallel.ForEach(ListOfPrimeItems, #MaxDopSetting#, k =>
        {
            ProcessDataForKey(k);
        });
    }

    private void ProcessDataForKey(string key)
    {
        // Go to the DB and fetch the record for key
        // Each column (entity data member) undergoes some processing here
        // Some string manipulations
        // Finally, convert the new state of the data to XML (serialize) and store it in the cache
        MyCache[key] = TranslatedStateOfData;
    }
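One way to narrow down a stall like this is to record which keys are currently being processed, so that a service call (which still responds) can report the keys that have been in flight for a long time. The following is only a rough sketch under assumptions: `ProcessDataForKey` here is a hypothetical stand-in with no DB access, and `BuildCache`, `inFlight` and the degree of parallelism of 4 are names and values made up for illustration, not part of the original code.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class CacheBuildTrackingSketch
{
    // Hypothetical stand-in for the real per-key work
    // (DB fetch, string processing, XML serialization).
    private static string ProcessDataForKey(string key)
    {
        return "<item>" + key + "</item>";
    }

    // Builds the cache in parallel. While a key is being processed it is
    // listed in inFlight together with its start time, so another thread
    // can inspect which keys have been running for a long time.
    public static void BuildCache(
        IEnumerable<string> keys,
        ConcurrentDictionary<string, string> cache,
        ConcurrentDictionary<string, DateTime> inFlight,
        int maxDegreeOfParallelism)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxDegreeOfParallelism };

        Parallel.ForEach(keys, options, key =>
        {
            inFlight[key] = DateTime.UtcNow;
            try
            {
                cache[key] = ProcessDataForKey(key);
            }
            finally
            {
                // Remove the key even if processing throws.
                DateTime ignored;
                inFlight.TryRemove(key, out ignored);
            }
        });
    }

    public static void Main()
    {
        var keys = Enumerable.Range(1, 1000).Select(i => "key" + i).ToList();
        var cache = new ConcurrentDictionary<string, string>();
        var inFlight = new ConcurrentDictionary<string, DateTime>();

        BuildCache(keys, cache, inFlight, 4);

        Console.WriteLine("Cached items: " + cache.Count);
        Console.WriteLine("Keys still in flight: " + inFlight.Count);
    }
}
```

While the build is running, any entry in `inFlight` whose start time is many minutes old points at a key whose processing step is stuck, which helps separate a slow processing step from a problem in the parallel loop itself.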
Upvotes: 2
Views: 245
Reputation: 1937
Writing this update so that someone else may benefit from it. The Task Parallel Library was not at fault in my case. The issue was in one of my data-crunching steps: I use regular expressions there, and one of them suffered from "catastrophic backtracking".
I fixed the regex and the cache build is now blazing fast (it finishes within minutes). Thank you, everyone, for the suggestions, even though I posted the wrong problem. It feels silly to have missed such a small bug.
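For anyone who has not run into catastrophic backtracking before, here is a minimal sketch of the effect and of one way to guard against it. The patterns are textbook examples chosen for illustration, not the regex from my code, and `TryIsMatch`, `RiskyPattern` and `SaferPattern` are made-up names. The sketch assumes .NET 4.5 or later, where `Regex.IsMatch` accepts a match timeout and throws `RegexMatchTimeoutException` when it is exceeded.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexBacktrackingSketch
{
    // Nested quantifiers: a textbook shape that is prone to catastrophic
    // backtracking when the input almost matches. Illustrative only.
    public const string RiskyPattern = @"^(a+)+$";

    // Accepts the same strings without the nested quantifier.
    public const string SaferPattern = @"^a+$";

    // Returns true or false for a completed match attempt,
    // or null when the attempt exceeded the timeout.
    public static bool? TryIsMatch(string input, string pattern, TimeSpan timeout)
    {
        try
        {
            return Regex.IsMatch(input, pattern, RegexOptions.None, timeout);
        }
        catch (RegexMatchTimeoutException)
        {
            return null;
        }
    }

    public static void Main()
    {
        // Many 'a' characters followed by one character that breaks the match.
        string input = new string('a', 40) + "!";
        TimeSpan timeout = TimeSpan.FromMilliseconds(500);

        bool? risky = TryIsMatch(input, RiskyPattern, timeout);
        Console.WriteLine("Risky pattern: " + (risky.HasValue ? risky.Value.ToString() : "timed out"));

        bool? safer = TryIsMatch(input, SaferPattern, timeout);
        Console.WriteLine("Safer pattern: " + (safer.HasValue ? safer.Value.ToString() : "timed out"));
    }
}
```

On a backtracking regex engine the risky pattern is expected to hit the timeout on this input, while the safer pattern rejects it right away. A match timeout does not fix a bad pattern, but it turns a thread that appears frozen for hours into an exception that names the failing step.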
Upvotes: 2