bill seacham
bill seacham

Reputation: 395

How to roll my own index in c#?

I need a faster way to create an index file. The application generates pairs of items to be indexed. I currently add each pair as it is generated to a sorted dictionary and then write it out to a disk file. This works well until the number of items added exceeds one million, at which time it slows to the point that is unacceptable. There can be as many as three million data items to be indexed. I prefer to avoid a database because I do not want to significantly increase the size of the deployment package, which is now less than one-half of one megabyte. I tried Access but it is even slower than the sorted dictionary -if it had an efficient bulk load utility then that might work, but I do not find such a tool for Access.

Is there a better way to roll my own index?

Upvotes: 1

Views: 236

Answers (3)

Henk Holterman
Henk Holterman

Reputation: 273244

Is the SortedDictionary really the bottleneck? As compared to the I/O ?
You really should profile this first to prevent optimizing the wrong parts.

But as a tip, when you have 1M or more items it is a good idea to pre-allocate your Dictionary. Give it an initial capacity of 2M or so.

//var index = new SortedDictionary(2 * 1024 * 1024);  // not supported, it's a tree
var index = new SortedList(2 * 1024 * 1024);

If your Dictionary is the problem I would expect it to be from constant re-allocation sooner than from the actual index-searches.

Upvotes: 6

Russ Clarke
Russ Clarke

Reputation: 17909

Just a thought, but could you use an in-memory SQL solution, like SQL Lite ? It's just a small DLL, but will help your priorities, do your logic in C# and do your sorting in SQL.

Have a look here:

http://web.archive.org/web/20100208133236/http://www.mikeduncan.com/sqlite-on-dotnet-in-3-mins/

The Download for SQL Lite itself is only 253k, and the .net bindings are about 75k.

Upvotes: 1

Josh Smeaton
Josh Smeaton

Reputation: 48720

Is SQLite too big to deploy with your software? I'm going to agree with Henk that the constant reallocations within the SortedDictionary are probably the bottleneck. If that solution proves false, try using SQLite to see if that increases performance, and then you can decide on where to go from there.

Upvotes: 0

Related Questions