Reputation: 8922
I am using Linq to Sql. Here is the code:
Dictionary<string, int> allResults;
using (var dc= new MyDataContext())
{
dc.CommandTimeout = 0;
allResults = dc.MyTable.ToDictionary(x => x.Text, x => x.Id);
}
it is ran on a 64 bit machine and the compilation is AnyCPU. It throws a System.OutOfMemoryException.
This accesses an SQL Server database. The Id field maps to an SQL Server int field, And the Text field maps to Text(nvarchar(max)) field. Running select COUNT(*) from TableName
results in 1,173,623 records and running select sum(len(Text)) from TableName
results in 48,915,031. Since int is a 32 bit integer, the ids should take only 4.69MB of space and the strings less than 1GB. So we are not even bumping against the 2GB/object limit.
I then change the code in this way:
Dictionary<string, int> allResults;
using (var dc = new MyDataContext())
{
Dictionary<string, int> threeHundred;
dc.CommandTimeout = 0;
var tot = dc.MyTable.Count();
allResults = new Dictionary<string, int>(tot);
int skip = 0;
int takeThis = 300000;
while (skip < tot)
{
threeHundred = dc.MyTable.Skip(skip).Take(takeThis).ToDictionary(x => x.Text, x => x.Id);
skip = skip + takeThis;
allResults = allResults.Concat(threeHundred).ToDictionary(x => x.Key, x => x.Value);
threeHundred = null;
GC.Collect();
}
}
I learn that garbage collaction here does not help and that the out of memory exception is thrown on the first line in the while loop once skip = 900,000.
What is wrong and how do I fix this?
Upvotes: 2
Views: 1708
Reputation: 501
Without getting into your calculations of how much it should take in memory (as there could be issues of encoding that could easily double the size of the data), I'll try to give a few pointers.
Starting with the cause of the issue - my guess is that the threeHundred dictionary is causing a lot of allocations. When you add items to a dictionary like above, the dictionary won't be able to know how many items it should pre-allocated. Which will cause a massive re-allocation and coping of all data to newly created dictionaries. Please set a size (using the ctor) to the threeHundred dictionary before adding any items to it.
Please read this article I've published which goes in-depth into Dictionary internals - I'm sure it will shed some light on those symptoms. http://www.codeproject.com/Articles/500644/Understanding-Generic-Dictionary-in-depth
In addition, when trying to populate this large amount of data, I suggest to fully control the process. My suggestion:
Either way, you should check if you are running on 32 bit or 64 bit. .Net 4.5 prefers 32 bit. (check it on Task Manager or the project properties)
Hope this helps, Ofir.
Upvotes: 1