Reputation: 12087
I am looking to create a fast lookup (dictionary?) with optional keys so for example, let's say I have 3 keys: "first_name", "last_name", "zipcode"
so I would like to be able to do the following (pseudocode):
GetValue(first_name) -- would return a list of everyone with that first name
GetValue(first_name, last_name) -- would return a list of everyone with that first name & last name
GetValue(zipcode, first_name) -- would return a list of everyone with that first_name in the specified zipcode
I should be able to query out all permutations of those keys. What data structure would you use for this? How would you implement this?
Upvotes: 3
Views: 4881
Reputation: 64943
You could still use regular dictionaries where the key could be a custom type like this:
public class CompositeKey
{
public CompositeKey(string firstName, string lastName, string zipCode)
{
FirstName = firstName;
LastName = lastName;
ZipCode = zipCode;
}
public string FirstName { get; }
public string LastName { get; }
public string ZipCode { get; }
}
Now I would override Equals
and GetHashCode
on CompositeKey
to provide what makes a composite key unique so Dictionary<TKey, TValue>
would be able to store unique composite keys.
Finally, I would be able to query the dictionary this way:
var value = dict[new CompositeKey(firstName: "Matías", lastName: "Fidemraizer" )];
OP asked this question in some comment:
I thought about this approach but how would you query the dictionary for "FirstName = "Matias" only?
Since you're overriding both Equals
and GetHashCode
, you can add all combinations as keys in the whole dictionary and they can all co-exist there:
Person person = new Person { /* Set members here */ }
// Note that I'll add many keys that have the same value
dict.Add(new CompositeKey(name: "Matías"), person);
dict.Add(new CompositeKey(lastName: "Fidemraizer"), person);
dict.Add(new CompositeKey(firstName: "Matías", lastName: "Fidemraizer"), person);
Each key will result in a different hash code so they can all co-exist in the same dictionary and they will provide a powerful tool to query by many criterias and criteria combinations.
Other approach could be using multiple dictionaries where their keys are concatenations of the whole values using some convention and the values are instances of the whole class:
Dictionary<string, Person> names = new Dictionary<string, Person>();
names.Add("matias", new Person { /* Set members here */ });
Dictionary<string, Person> names = new Dictionary<string, Person>();
names.Add("matias:fidemraizer", new Person { /* Set members here */ });
// And so on, for every criteria you want to search...
Later you would implement a proxy to determine what dictionary to query based on the given criteria.
Actually you should take a look at Redis, which is a key-value store with complex data structures as values like hashes, sets, sorted sets and many more. That is, you could centralize your cache and distribute it using Redis, and you cache could be consumed by many applications.
It's extremely simple to use and install (it's an executable of less than 10MB...).
He has said:
What about a second person with the same first name?
If OP would need to consider this case (yes, it's not an exceptional case, so it's worth the effort to consider it!), it seems like OP would need to store data in a dictionary where keys are the whole composite keys and values should be List<Person>
, HashSet<Person>
or even LinkedList<Person>
.
Furthermore, this would mean that one key (slot) would be able to store many persons, and a query like get a person with first name "Matías"
would always return an implementation of IEnumerable<Person>
(list, hash, linkedlist...), where the whole returned collection would be found persons:
KeyValuePair<CompositeKey, ISet<Person>> result;
if(dictionary.TryGetValue(new CompositeKey(firstName: "Matías"), out result))
{
// I've got either one or many results and I'll decide what to do in
// that case!
}
Also, this enhanced approach has another possible issue. When you query with a composite key like new CompositeKey(firstName: "Matías")
and the whole dictionary store could have stored more than a person with "Matías"
first name, you'll get an ISet<Person>
, IList<Person>
or LinkedList<Person>
.
The first search to get one or many results has a complexity O(1)
(constant time) because the whole composite key is stored based on its hash code, but the returned result of the first search isn't a dictionary anymore and any search against them is going to be O(N) (the more items you get, the more time is taken to find a result).
BTW, if you're trying to find a person by its first name, it's because you know that you can get more than a result and you can't expect to find one unless only one person with the whole first name has been stored in the dictionary.
So it seems that you'll need to disambiguate results if their count is greater than 1
, and this can be done either performing another O(1)
search by providing a composite key with more than the first name, or some human user (or artificial intelligence...) will need to manually choose one of the results.
In summary:
O(1)
complexity):KeyValuePair<CompositeKey, ISet<Person>> result;
if(dictionary.TryGetValue(new CompositeKey(firstName: "Matías"), out result))
{
if(result.Value.Count > 1)
{
// Here you would show the user what you've found in the UI
// and the whole user would choose one of the results directly,
// which is an operation with O(1) complexity
}
else if(result.Value.Count <= 1)
{
// OK, I got 0 or 1 result, this is easier than I thought! ;)
}
}
public KeyValuePair<CompositeKey, ISet<Person>> SearchPerson(CompositeKey key)
{
KeyValuePair<CompositeKey, ISet<Person>> result;
if(dictionary.TryGetValue(new CompositeKey(firstName: "Matías"), out result))
{
if(result.Value.Count > 1)
{
// Oops! More than one result..... BUT I already know another
// component that will make the whole key absolutely unique, so
// I'll call this method recursively to specialize the search even
// more. Obviously, I've hardcoded the ZIP code as a sample, but
// in a real-world case, who knows from where I would get this
// ZIP code... Maybe from some geolocalization query based on current
// user's location?
// Wait, it might happen that a person called Matías could live
// in a location near be so this other person would have stored
// the same ZIP code... Well, this goes outside the scope of this
// Q&A. It's just an example of what to do, in an actual application
// there should be many other choices to disambiguate persons
// automatically...
return SearchPerson(new CompositeKey(firstName: key.FirstName, zipCode: "03984"));
}
else if(result.Value.Count <= 1)
{
// OK, I got 0 or 1 result, this is easier than I thought! ;)
}
}
}
Upvotes: 7
Reputation: 12087
Based on all the ideas here and the fact that I was constrained to using .NET 2.0 I did it like this. Including it just for completeness in case someone comes across this problem again. [Won't mark this as answer because it is based on a lot of ideas from @Matias above so marking his as the answer that contributed the most to the ultimate solution]:
public class PersonKey {
public string FirstName { get; private set; }
public string LastName { get; private set; }
public int Zipcode { get; private set; }
public PersonKey() {
FirstName = null;
LastName = null;
Zipcode = int.MinValue;
}
public PersonKey(int Zipcode, string FirstName) : this() {
this.FirstName = FirstName;
this.Zipcode = Zipcode;
}
public PersonKey(string LastName, string FirstName) : this() {
this.FirstName = FirstName;
this.LastName = LastName;
}
public PersonKey(int Zipcode, string LastName, string FirstName) {
this.Zipcode = Zipcode;
this.LastName = LastName;
this.FirstName = FirstName;
}
public List<string> KeyList {
get {
var keyLst = new List<string>();
if (!String.IsNullOrEmpty(FirstName))
keyLst.Add("FirstName:" + FirstName);
if (!String.IsNullOrEmpty(LastName))
keyLst.Add("LastName:" + LastName);
if (Zipcode != int.MinValue)
keyLst.Add("Zipcode:" + Zipcode);
return keyLst;
}
}
public string Key {
get {
return MakeKey(KeyList.ToArray());
}
}
public List<string[]> AllPossibleKeys {
get {
return CreateSubsets(KeyList.ToArray());
}
}
List<T[]> CreateSubsets<T>(T[] originalArray) {
List<T[]> subsets = new List<T[]>();
for (int i = 0; i < originalArray.Length; i++) {
int subsetCount = subsets.Count;
subsets.Add(new T[] { originalArray[i] });
for (int j = 0; j < subsetCount; j++) {
T[] newSubset = new T[subsets[j].Length + 1];
subsets[j].CopyTo(newSubset, 0);
newSubset[newSubset.Length - 1] = originalArray[i];
subsets.Add(newSubset);
}
}
return subsets;
}
internal string MakeKey(string[] possKey) {
return String.Join(",", possKey);
}
and then I create a cache like this:
//declare the cache
private Dictionary<string, List<Person>> _lookup = new Dictionary<string, List<Person>>();
when I read a record from the database I store it in the cache like this:
var key = new PersonKey(person.ZipCode, person.LastName, person.FirstName);
List<Person> lst;
foreach (var possKey in key.AllPossibleKeys) {
var k = key.MakeKey(possKey);
if (!_lookup.TryGetValue(k, out lst)) {
lst = new List<Person>();
_lookup.Add(k, lst);
}
lst.Add(person);
}
Looking up from the cache is pretty straightforward:
List<Person> lst;
var key = new PersonKey(lastName, firstName); //more constructors can be added in PersonKey
_lookup.TryGetValue(key.Key, out lst);
Upvotes: 0
Reputation: 30062
You can use 3 Lookup
s :
var FirstNamesLookup = data.ToLookup(x => Tuple.Create(x.FirstName), x => x);
var FirstAndLastLookup = data.ToLookup(x => Tuple.Create(x.FirstName, x.LastName), x => x);
var FirstAndZipLookup = data.ToLookup(x => Tuple.Create(x.FirstName, x.zipCode), x => x);
All records with certain FirstName:
var matches = FirstNamesLookup[Tuple.Create("SomeName")].ToList();
All records with certain FirstName and LastName:
var matches = FirstAndLastLookup[Tuple.Create("SomeName", "SomeLastName")].ToList();
Same goes for the third case.
Upvotes: 3
Reputation: 14487
You should just use any generic collection type, and use LINQ to do the lookup :
var addresses = Enumerable.Empty<Address>();
// would return a list of everyone with that first name
addresses.Where(x => x.FirstName == "firstname");
// would return a list of everyone with that first name & last name
addresses.Where(x => x.FirstName == "firstname" && x.LastName == "lastname");
// would return a list of everyone with that first_name in the specified zipcode
addresses.Where(x => x.FirstName == "firstname" && x.ZipCode == "zipcode");
Upvotes: 1