Percy

Reputation: 3125

Dictionary look up where we want the keys contained in a string

I have a dictionary containing keys, e.g.

"Car"
"Card Payment"

I have a string description, e.g. "Card payment to tesco" and I want to find the item in the dictionary that corresponds to the string.

I have tried this:

var category = dictionary.SingleOrDefault(p => description.ToLowerInvariant().Contains(p.Key)).Value;

This currently results in both "Car" and "Card Payment" being returned from the dictionary and my code blows up as I have SingleOrDefault.

How can I achieve what I want? I thought about prefixing and suffixing the keys with spaces, but I'd have to do the same to the descriptions. I think this would work, but it is a bit dirty. Are there any better ways? I have no objection to changing the Dictionary to some other type, as long as performance is not impacted too much.

Required Result for above example: only get "Card Payment"

Upvotes: 3

Views: 118

Answers (3)

Olivier Jacot-Descombes

Reputation: 112352

You are abusing dictionaries. Scanning the keys gives you no performance gain from a dictionary; a simple list would even be faster in this case. A dictionary only approaches constant-time access (O(1)) when you look up a value by its key:

if (dictionary.TryGetValue(key, out var value)) { ...

To be able to use this advantage you will need a more subtle approach. The main difficulty is that sometimes keys might consist of more than a single word. Therefore I would suggest a two level approach where at the first level you store single word keys and at the second level you store the composed keys and values.

Example: Key value pairs to be stored:

["car"]: categoryA
["card payment"]: categoryB
["payment"]: categoryC

We build a dictionary as

// TValue stands for whatever type the categories have.
var dictionary = new Dictionary<string, List<KeyValuePair<string, TValue>>> {
    ["car"] = new List<KeyValuePair<string, TValue>> {
        new KeyValuePair<string, TValue>("car", categoryA)
    },
    ["card"] = new List<KeyValuePair<string, TValue>> {
        new KeyValuePair<string, TValue>("card payment", categoryB)
    },
    ["payment"] = new List<KeyValuePair<string, TValue>> {
        new KeyValuePair<string, TValue>("card payment", categoryB),
        new KeyValuePair<string, TValue>("payment", categoryC)
    }
};

Of course, in reality, we would do this using an algorithm. But the point here is to show the structure. As you can see, the third entry for the main key "payment" contains two entries: One for "card payment" and one for "payment".

The algorithm for adding values goes like this:

  1. Split the key to be entered into single words.
  2. For each word, create a dictionary entry using this word as main key and store a key value pair in a list as dictionary value. This second key is the original key possibly consisting of several words.

As you can imagine, step 2 requires you to test whether an entry with the same main key is already there. If yes, then add the new entry to the existing list. Otherwise create a new list with a single entry and insert it into the dictionary.
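The two insertion steps could be sketched roughly as follows. This is only an illustration: the class name WordIndex and the method Add are hypothetical, TValue is fixed to string for brevity, and keys are assumed to be split on single spaces and compared in lower case.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper illustrating the two-level structure (TValue = string here).
public static class WordIndex
{
    public static void Add(
        Dictionary<string, List<KeyValuePair<string, string>>> index,
        string key,
        string value)
    {
        string normalizedKey = key.ToLowerInvariant();

        // Step 1: split the (possibly multi-word) key into single words.
        string[] words = normalizedKey.Split(
            new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        foreach (string word in words)
        {
            // Step 2: reuse the list already stored under this word,
            // or create a new one and insert it into the dictionary.
            if (!index.TryGetValue(word, out var entries))
            {
                entries = new List<KeyValuePair<string, string>>();
                index.Add(word, entries);
            }

            // The second-level key is the original, complete key.
            entries.Add(new KeyValuePair<string, string>(normalizedKey, value));
        }
    }
}
```

Adding "Car", "Card Payment" and "Payment" this way yields the three main keys "car", "card" and "payment" from the example above, with two entries stored under "payment".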

Retrieve an entry like this:

  1. Split the description to be looked up into single words.
  2. For each word, retrieve the existing dictionary entries using a true and therefore fast dictionary lookup(!) into a List<List<KeyValuePair<string, TValue>>>.
  3. Flatten this list of lists using SelectMany into a single List<KeyValuePair<string, TValue>>
  4. Sort them by key length in descending order and test whether the description contains the key. The first entry found is the result.

You can also combine steps 2 and 3 and directly add the list entries of the single dictionary entries into a main list.
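A minimal sketch of the retrieval, using the combined variant of steps 2 and 3. The names WordIndexLookup and Find are hypothetical, TValue is again assumed to be string, and the index is assumed to hold lower-case keys in the structure shown earlier.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical lookup over the two-level structure (TValue = string here).
public static class WordIndexLookup
{
    // Returns the value of the longest stored key contained in the description,
    // or null if no stored key matches.
    public static string Find(
        Dictionary<string, List<KeyValuePair<string, string>>> index,
        string description)
    {
        string normalized = description.ToLowerInvariant();

        // Step 1: split the description into single words.
        string[] words = normalized.Split(
            new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        // Steps 2 and 3 combined: one real dictionary lookup per word,
        // collecting all candidate entries in a single flat list.
        var candidates = new List<KeyValuePair<string, string>>();
        foreach (string word in words)
        {
            if (index.TryGetValue(word, out var entries))
            {
                candidates.AddRange(entries);
            }
        }

        // Step 4: longest key first; the first key contained in the description wins.
        return candidates
            .OrderByDescending(entry => entry.Key.Length)
            .Where(entry => normalized.Contains(entry.Key))
            .Select(entry => entry.Value)
            .FirstOrDefault();
    }
}
```

With the example index, the description "Card payment to tesco" only triggers lookups for "card", "payment", "to" and "tesco". The main key "car" is never consulted, because "car" is not a word of the description.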

Upvotes: 0

Rajib Chy

Reputation: 880

Here I'm using a List<string> of keys and System.Text.RegularExpressions to find the desired key.
Try it.

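// Requires: using System.Linq; using System.Text.RegularExpressions;
string description = "Card payment to tesco";
List<string> keys = new List<string> { "Car", "Card Payment" };
string desc = description.ToLowerInvariant();
// \b anchors the key at word boundaries, so "car" does not match inside "card".
// (Putting the key inside [...] would turn it into a character class instead.)
string pattern = @"\b{0}\b";
var resp = keys.FirstOrDefault(a => {
    // Regex.Escape keeps any special characters in a key from being read as regex syntax.
    var regx = new Regex(string.Format(pattern, Regex.Escape(a.ToLowerInvariant())));
    return regx.IsMatch(desc);
});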

Upvotes: 0

D-Shih

Reputation: 46219

You can use LINQ's OrderByDescending and Take after your Where condition to find the longest, and therefore best, matching key.

var category = dictionary
               .Where(p => description.ToLowerInvariant().Contains(p.Key.ToLowerInvariant()))
               .OrderByDescending(x => x.Key.Length)
               .Take(1);


I would use a List<string> to hold your keys if you only need the keys, because then there is no reason to use a key/value collection.

List<string> keys = new List<string>();
keys.Add("Car");
keys.Add("Card Payment");

string description = "Card payment to tesco";

var category = keys
        .Where(p => description.ToLowerInvariant().Contains(p.ToLowerInvariant()))
        .OrderByDescending(x => x.Length)
        .Take(1)
        .FirstOrDefault();

NOTE

Ordering by key length in descending order ensures that the longest matching key, i.e. the most specific one, comes first.
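If the category value is needed rather than the key, the same idea can be applied to the dictionary itself. The following is a sketch only: the names LongestKeyMatch and FindCategory and the category values in the usage example are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: longest-key-wins lookup over a key/value collection.
public static class LongestKeyMatch
{
    // Returns the value of the longest key contained in the description,
    // or null if no key is contained in it.
    public static string FindCategory(
        IDictionary<string, string> categories, string description)
    {
        // Lower-case the description once instead of once per key.
        string normalized = description.ToLowerInvariant();

        return categories
            .Where(p => normalized.Contains(p.Key.ToLowerInvariant()))
            .OrderByDescending(p => p.Key.Length)
            .Select(p => p.Value)
            .FirstOrDefault();
    }
}
```

For example, with the entries "Car" => "Vehicle" and "Card Payment" => "Shopping", the description "Card payment to tesco" resolves to "Shopping". This is still a linear scan over all keys with plain substring matching, so a description that contains "card" but no longer key would still match "Car".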

Upvotes: 2
