Wild Goat
Wild Goat

Reputation: 3579

Hashing set of objects C#

I have number of orders and each order contains purchased Item objects.

1 : {Item1, Item2, Item3, Item4, Item5}  
2 : {Item2, Item8, Item4, Item3, Item11, Item5} 
3 : { ... }

My goal is to establish how frequent each of those items were bought together and able to get results in O(1).

My idea was iterate through orders, based on subset items - increment particular array's element. That will give me possibility extract required value in O(1).

Eg. Item3 and Item4 were bought 2 times.

int frequency = myArray[getHash(Item3+Item4)]

print frequency;

Output : 2

Problem:

Develop a int getHash(...) function, which will be able to hash subset of items.

Note: {Item1, Item2} = {Item2, Item1}

Thank you very much! Any help of better ideas are welcome!

Upvotes: 2

Views: 426

Answers (2)

LightStriker
LightStriker

Reputation: 21004

I'm sorry, I'm not using hash, but I tried to give it a go in a way I would do it. Just like to try to solve that kind of challenge.

Dictionary<Item, Dictionary<Item, Count>> combine = new Dictionary<Item, Dictionary<Item, Count>>();

foreach (Item item in Sell)
{
    Dictionary<Item, int> key;
    if (!combine.TryGetValue(item, out key))
    {
        key = new Dictionary<Item, Count>();
        combine.Add(item, key);
    }

    foreach (Item otherItem in Sell)
    {
        if (item == otherItem)
            continue;

        Count count;
        if (key.TryGetValue(otherItem, out count))
            count++;
        else
            key.Add(otherItem, new Count());
    }
}

That's probably very stupid, as for each item you ends up with a dictionary of all other items bought at the same time with a counter. And if you want to know if Item1 is bought at the same time as Item2 AND Item3 vs Item2 OR Item3... Bleh. Nevermind me.

Upvotes: 0

corsiKa
corsiKa

Reputation: 82559

Because {A,B} = {B,A} You first need to sort your list before proceeding. It doesn't matter what you sort by, but you do need to ensure that no values are considered equal for sorting purposes unless they can be interchangeable in their ordering.

Next, any simple hashing algorithm should work. A common technique is to use two primes, I'll call them c and p.

int hash = c;
foreach(Item i in items) hash = hash * p + i.GetHashCode()
return hash;

p is sometimes chosen to be 31 because not only is it prime, but a compiler resolves it to a bitshift and a subtract, which is much faster than a multiply. x * 31 is the same as (x << 5) - 1 (assuming I used the proper shift... I screw that up from time to time, haha.)

Upvotes: 4

Related Questions