Grant H.
Grant H.

Reputation: 3717

Efficiently Filtering Large Sets of Multi-Key Objects

I have a number of items, each of which has a number of Codes associated with it. Each item will have a collection of these.

List<ComplexObject> myCollection = new List<ComplexObject>();
var item = new ComplexObject() 
           { 
               Id = 5, 
               Codes = new List<string>() { "1", "4", "5" } 
           }; 
myCollection.Add(item);

I need to filter these objects by code, on demand. There will be many entries, potentially >1,000,000, each containing a collection of associated keys.

I can obviously use something like:

var output = (from p in myCollection
             from q in p.Codes
             where q == x
             select p).ToList();

But, I want to be able to filter by code without having to loop through all objects - which is what is going on here. Performance is critical in this context, so I want to achieve a better than O(n) lookup time on these objects.

I've considered creating a dictionary that has each Code as a key, and a collection of ComplexObjectIds, but this seems a bit clunky - but perhaps it's the best option. In my head I envision some kind of web network collection...

Given that performance is my ultimate goal, is there a more optimal approach?

Upvotes: 1

Views: 90

Answers (1)

spender
spender

Reputation: 120508

So create an ILookup using the .ToLookup extension method:

var lookup = myCollection
  .SelectMany(co => co.Codes.Select(code => new{code, co}))
  .ToLookup(x => x.code, x => x.co);

so now you can query the lookup

IEnumerable<ComplexObject> complexObjs = lookup["1"];  

in O(1) time.

Upvotes: 1

Related Questions