Shazam
Shazam

Reputation: 708

Hashing a tuple in Python where order matters?

I have:

tuple1 = token1, token2
tuple2 = token2, token1
for tuple in [tuple1, tuple2]:
    if tuple in dict:
        dict[tuple] += 1
    else:
        dict[tuple] = 1

However, both tuple 1 and tuple2 are getting the same counts. What is a way to hash a group of 2 things such that order matters?

Upvotes: 6

Views: 23009

Answers (3)

mgilson
mgilson

Reputation: 309929

Order is taken into account when hashing:

>>> hash((1,2))
1299869600
>>> hash((2,1))
1499606158

This assumes that the objects themselves have unique hashes. Even if they don't, you could still be OK when using it in a dictionary (as long as the objects themselves aren't equal as defined by their __eq__ method):

>>> t1 = 'a',hash('a') 
>>> [hash(x) for x in t1]  #both elements in the tuple have same hash value since `int` hash to themselves in cpython
[-468864544, -468864544]
>>> t2 = hash('a'),'a'
>>> hash(t1)
1486610051
>>> hash(t2)
1486610051
>>> d = {t1:1,t2:2}  #This is OK.  dict's don't fail when there is a hash collision
>>> d
{('a', -468864544): 1, (-468864544, 'a'): 2}
>>> d[t1]+=7
>>> d[t1]
8
>>> d[t1]+=7
>>> d[t1]
15
>>> d[t2]   #didn't touch d[t2] as expected.
2

Note that due to hash collisions, this dict is likely to be less efficient than another dict where there aren't hash collisions :)

Upvotes: 23

inspectorG4dget
inspectorG4dget

Reputation: 113975

It seems like you have posted one instance of the body of a loop. Might I suggest that you use collections.Counter for what you are trying to do, which does exactly what you want, but in one line:

counter = (collections.Counter(myListOfTuples) + 
           collections.Counter([j,i for i,j in myListOfTuples]))

Upvotes: 0

NPE
NPE

Reputation: 500437

The reason they are getting the same count is that your code explicitly increments both token1,token2 and token2,token1 counts at the same time. If you don't, the counts won't stay in lockstep:

In [16]: import collections

In [17]: d = collections.defaultdict(int)

In [18]: d[1,2] += 1

In [19]: d[1,2]
Out[19]: 1

In [20]: d[2,1]
Out[20]: 0

Upvotes: 7

Related Questions