Rawle

Reputation: 199

Dictionary<string, string> Value lookup performance

I am working on a small project but have run into a performance roadblock.

I have a Dictionary<string, string>()

I have a string[].

Let's say my Dictionary has 50,000 entries, and my string[] has 30,000 entries.

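I want to collect the keys from my Dictionary whose value, with its characters sorted (value.ToCharArray().OrderBy(x => x)), equals the sorted characters of any entry in my string[]. In other words, I want the keys whose value is an anagram of one of the array entries.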

I have tried reducing the number of KeyValue pairs I have to look through by comparing the length of each string[] entry to the length of the values in the Dictionary, but that has not really gained me any performance.

Does anyone have any ideas on how I can improve the performance of this lookup?

Thanks!

To expand the pseudocode:

var stringToLookUp = GetSomeStrings(s.ToString()).Select(x => x).OrderBy(x => x).ToArray();
var aDictionaryOfStringString = GetDictionary(Resources.stringList);

var results = new List<string>();

foreach (var theString in stringToLookUp.Where(aString => aString.Length > 0))
{
    // Copy the loop variable so the lambdas below capture a stable value.
    var theStringClosure = theString;

    // Keep only entries of matching length that have not been collected yet.
    var filteredKeyValuePairs = aDictionaryOfStringString.Where(w => w.Value.Length == theStringClosure.Length && !results.Contains(w.Key)).ToArray();
    // Sort each candidate value's characters and compare with the lookup string
    // (which is assumed here to already be in sorted character order).
    var foundStrings = filteredKeyValuePairs.Where(kv => kv.Value.ToCharArray().OrderBy(c => c).ToArray().SequenceEqual(theStringClosure))
            .Select(kv => kv.Key)
            .ToArray();
    results.AddRange(foundStrings);
}

Upvotes: 0

Views: 1478

Answers (2)

Ondrej Svejdar

Reputation: 22094

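I think the principal problem is that you iterate over the whole dictionary for every string in the array. With N dictionary entries and M array entries that is O(N*M), which is effectively O(N^2). It is better to build a HashSet from the modified (character-sorted) keys of one collection, either the dictionary or the array, and then iterate over the other collection once. That is O(N + M) set operations, not counting the cost of sorting each string's characters.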
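To make the "modified key" concrete, here is a minimal sketch, assuming it means the string's characters in sorted order. The names KeyDemo and SortedKey are hypothetical and do not come from the question.

```csharp
using System.Linq;

static class KeyDemo
{
    // Hypothetical helper: returns the characters of s in sorted order.
    // Two strings share the same key exactly when they contain the same
    // characters the same number of times.
    public static string SortedKey(string s)
    {
        return new string(s.OrderBy(c => c).ToArray());
    }
}
```

With this helper, KeyDemo.SortedKey("listen") and KeyDemo.SortedKey("silent") both return "eilnst", so a single hash lookup on the key replaces the character-by-character comparison.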

using System.Collections.Generic;
using System.Linq;

// some values
var dictionary = new Dictionary<string, string>();
var fields = new string[] { };

// Sort the characters of every array entry once, up front.
string[] modifiedFields = new string[fields.Length];
for (var i = 0; i < fields.Length; i++)
{
  modifiedFields[i] = new string(fields[i].ToCharArray().OrderBy(x => x).ToArray());
}
// The set gives constant-time membership checks on the sorted keys.
var set = new HashSet<string>(modifiedFields);
var results = new List<string>();
foreach (var pair in dictionary)
{
  // Normalize the dictionary value the same way, then look it up.
  string key = new string(pair.Value.ToCharArray().OrderBy(x => x).ToArray());
  if (set.Contains(key))
  {
    results.Add(pair.Key);
  }
}
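The same idea also works in the other direction, by indexing the dictionary instead of the array. The following is only a sketch under that assumption; AnagramIndex, Build, Find and SortedKey are hypothetical names. It may be useful if the dictionary stays fixed while the array changes between lookups, since the index can then be built once and reused.

```csharp
using System.Collections.Generic;
using System.Linq;

static class AnagramIndex
{
    // Hypothetical helper: the characters of s in sorted order.
    static string SortedKey(string s)
    {
        return new string(s.OrderBy(c => c).ToArray());
    }

    // Groups the dictionary keys by the sorted-character form of their values.
    public static ILookup<string, string> Build(IDictionary<string, string> dictionary)
    {
        return dictionary.ToLookup(pair => SortedKey(pair.Value), pair => pair.Key);
    }

    // Returns the dictionary keys whose value is an anagram of any entry in fields.
    public static List<string> Find(ILookup<string, string> index, IEnumerable<string> fields)
    {
        return fields.Select(f => SortedKey(f))
                     .Distinct()                      // avoid reporting a key twice
                     .SelectMany(key => index[key])   // a missing key yields an empty sequence
                     .ToList();
    }
}
```

Note that the results come back grouped by array entry rather than in dictionary order, so sort them afterwards if the order matters.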

Upvotes: 2

Guy

Reputation: 1444

You can try this:

var stringToLookUp = GetSomeStrings(s.ToString()).Select(x => x).OrderBy(x => x).ToArray();
var aDictionaryOfStringString = GetDictionary(Resources.stringList);

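// Materialize the sorted forms as strings: comparing the OrderBy(...) sequences
// directly with Contains would use reference equality and never match.
var sortedLookups = new HashSet<string>(stringToLookUp.Select(str => new string(str.OrderBy(c => c).ToArray())));
var results = aDictionaryOfStringString.Where(kvp => sortedLookups.Contains(new string(kvp.Value.OrderBy(c => c).ToArray()))).Select(kvp => kvp.Key).ToList();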

Upvotes: 0
