Matt Brailsford

Reputation: 2237

Best searchable collection strategy for performance?

I've got a collection of objects with the following interface:

public interface IEntity
{
    string Key1 { get; set; }
    string Key2 { get; set; }
    // ... some other properties
}

and am looking for the best strategy for querying an in-memory collection of these objects via LINQ. The majority of queries (but not all) are likely to look up the entity by Key1 or Key2, so I'm not sure what the most performant way to query them is. My thoughts are:

IList<IEntity>

Just stick them in a list and use LINQ to filter them.

IDictionary<Tuple<string, string>, IEntity>

Create a multi-key dictionary using Key1 and Key2, though I'm not sure how I could access the IEntity if I only know one part of the key?

Something else

Is there some other, better way to achieve this?

Upvotes: 1

Views: 202

Answers (4)

Jodrell

Reputation: 35716

So, I have an IEnumerable<IEntity>. If the keys are independently unique then it's simple:

IEnumerable<IEntity> entities = ...

var byKey1 = entities.ToDictionary(e => e.Key1);
var byKey2 = entities.ToDictionary(e => e.Key2);

If they are not,

var byKey1 = entities.ToLookup(e => e.Key1);
var byKey2 = entities.ToLookup(e => e.Key2);

Then, if you have both keys,

var match = byKey1[key1].Intersect(byKey2[key2]);
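As a rough sketch of how the lookup approach above might be wrapped up so the lookups are built once and reused, consider the following. The `Entity` and `EntityIndex` types are hypothetical names introduced for illustration only, and the sketch assumes the collection does not change after the index is built.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for an IEntity implementation.
public sealed class Entity
{
    public string Key1 { get; set; }
    public string Key2 { get; set; }
}

// Hypothetical wrapper that builds both lookups once, up front.
public sealed class EntityIndex
{
    private readonly ILookup<string, Entity> _byKey1;
    private readonly ILookup<string, Entity> _byKey2;

    public EntityIndex(IEnumerable<Entity> entities)
    {
        // Materialize once so the source is only enumerated a single time.
        var snapshot = entities.ToList();
        _byKey1 = snapshot.ToLookup(e => e.Key1);
        _byKey2 = snapshot.ToLookup(e => e.Key2);
    }

    // Indexing a lookup with an unknown key yields an empty sequence, not an exception.
    public IEnumerable<Entity> ByKey1(string key1) => _byKey1[key1];

    public IEnumerable<Entity> ByKey2(string key2) => _byKey2[key2];

    // Intersect uses the default equality comparer, which is reference
    // equality here because Entity does not override Equals.
    public IEnumerable<Entity> ByBoth(string key1, string key2) =>
        _byKey1[key1].Intersect(_byKey2[key2]);
}
```

Queries on other properties would still need a plain LINQ scan over the original collection.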

Upvotes: 0

Jon

Reputation: 437376

For fast lookups based on keys you cannot do better than an associative container: either a hashtable such as Dictionary or a tree-based structure such as SortedDictionary. In the relatively uncommon case that your data structure is built once from sorted input and modified rarely, consider also SortedList. All of these have different performance characteristics, so the choice depends on the particulars.

If your keys have different types then you would practically have to go with multiple such containers, but here you can simply use just one and give each "type of key" a unique prefix. For example, you could decide to do this:

var dict = new Dictionary<string, IEntity>();
var entity = (IEntity)whatever;

dict.Add("key1:" + entity.Key1, entity);
dict.Add("key2:" + entity.Key2, entity);

// and now find by either Key1 or Key2 by using the same prefix

If the keys are not guaranteed to be unique then you would need a "MultiDictionary" or equivalent class, in which case you should take a look at the question "multimap in .NET".
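To make the prefix idea concrete, here is a minimal sketch of building and querying such a dictionary. `Entity` and `PrefixedIndex` are hypothetical names used only for this illustration, and the sketch assumes every Key1 and every Key2 value is unique and non-null.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for an IEntity implementation.
public sealed class Entity
{
    public string Key1 { get; set; }
    public string Key2 { get; set; }
}

public static class PrefixedIndex
{
    public static Dictionary<string, Entity> Build(IEnumerable<Entity> entities)
    {
        var dict = new Dictionary<string, Entity>();
        foreach (var entity in entities)
        {
            // Add throws ArgumentException on a duplicate key,
            // so this relies on the uniqueness assumption.
            dict.Add("key1:" + entity.Key1, entity);
            dict.Add("key2:" + entity.Key2, entity);
        }
        return dict;
    }

    // Returns null when no entity has the given Key1.
    public static Entity FindByKey1(Dictionary<string, Entity> dict, string key1)
    {
        Entity found;
        return dict.TryGetValue("key1:" + key1, out found) ? found : null;
    }

    // Returns null when no entity has the given Key2.
    public static Entity FindByKey2(Dictionary<string, Entity> dict, string key2)
    {
        Entity found;
        return dict.TryGetValue("key2:" + key2, out found) ? found : null;
    }
}
```

Because the prefixes differ, an entity whose Key1 happens to equal another entity's Key2 does not cause a collision.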

Upvotes: 2

Noel Kennedy

Reputation: 12258

A few things could work:

  • If you can accept the performance of just using a list and scanning through them, you are done!
  • You could use two or more dictionaries of type IDictionary<string, List<IEntity>>: Dictionary1 keyed on Key1, Dictionary2 keyed on Key2, and so on. Each dictionary value is a list holding all the entities that share that key. Accept poorer performance for lookups on properties you haven't indexed via a dictionary.
  • Maybe use a trie data structure.
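The second option might be sketched roughly as follows: one dictionary of lists per indexed key, plus a plain list for everything else. `Entity`, its `Name` property and `TwoKeyStore` are hypothetical names for illustration, and the sketch assumes keys are non-null and are not changed after an entity is added.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for an IEntity implementation.
public sealed class Entity
{
    public string Key1 { get; set; }
    public string Key2 { get; set; }
    public string Name { get; set; } // example of an unindexed property
}

public sealed class TwoKeyStore
{
    private readonly List<Entity> _all = new List<Entity>();
    private readonly Dictionary<string, List<Entity>> _byKey1 =
        new Dictionary<string, List<Entity>>();
    private readonly Dictionary<string, List<Entity>> _byKey2 =
        new Dictionary<string, List<Entity>>();

    public void Add(Entity entity)
    {
        _all.Add(entity);
        AddToIndex(_byKey1, entity.Key1, entity);
        AddToIndex(_byKey2, entity.Key2, entity);
    }

    public IReadOnlyList<Entity> FindByKey1(string key) => Find(_byKey1, key);

    public IReadOnlyList<Entity> FindByKey2(string key) => Find(_byKey2, key);

    // Fallback for properties without an index: a linear scan.
    public IEnumerable<Entity> Where(Func<Entity, bool> predicate) => _all.Where(predicate);

    private static void AddToIndex(
        Dictionary<string, List<Entity>> index, string key, Entity entity)
    {
        List<Entity> bucket;
        if (!index.TryGetValue(key, out bucket))
        {
            bucket = new List<Entity>();
            index[key] = bucket;
        }
        bucket.Add(entity);
    }

    private static IReadOnlyList<Entity> Find(
        Dictionary<string, List<Entity>> index, string key)
    {
        List<Entity> bucket;
        return index.TryGetValue(key, out bucket)
            ? bucket
            : (IReadOnlyList<Entity>)Array.Empty<Entity>();
    }
}
```

Every extra index costs memory and must be kept in sync on each add or remove, so it is probably only worth indexing the keys that are queried most often.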

Upvotes: 0

Kristian Schneider

Reputation: 175

Your list will take O(n) to search, whereas a dictionary lookup should take O(1) on average, at the cost of extra memory. So your dictionary approach will be quickest.

Upvotes: 0
