Mark Grey

Reputation: 10257

Python - set somehow getting duplicate data

I have a class definition with a __hash__ method that uses the object's properties to create a unique key for comparison in Python sets.

The hash method looks like this:

def __hash__(self):
    return int('%d%s' % (self.id, self.create_key))

In a module that uses this class, several queries are run that could conceivably construct duplicate instances of it, so the queue built by the function responsible is represented as a set to make sure the dupes are omitted:

in_set = set()
out_set = set()
for inid in inids:
    ps = Perceptron.getwherelinked(inid, self.in_ents)
    for p in ps:
        in_set.add(p)

for poolid in poolids:
    ps = Perceptron.getwherelinked(poolid, self.out_ents)
    for p in ps:
        out_set.add(p)
return in_set.union(out_set)

Somehow, despite calling the union method, I am still getting two duplicate instances. When printed (with a __str__ method in the Perceptron class that just returns the hash), the two hashes are identical, which in theory shouldn't be possible.

set([1630, 1630])

Any guidance would be appreciated.

Upvotes: 0

Views: 179

Answers (2)

Ignacio Vazquez-Abrams

Reputation: 799120

If a class does not define a __cmp__() or __eq__() method it should not define a __hash__() operation either

source

Define __eq__().
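
A minimal sketch of why identical hashes are not enough, assuming a stripped-down stand-in class (HashOnly is a hypothetical name, not the asker's Perceptron). A set uses the hash only to locate a slot and then compares candidates with ==, which falls back to object identity when __eq__ is not defined:

```python
class HashOnly:
    """Hypothetical stand-in: defines __hash__ but not __eq__."""

    def __init__(self, id, create_key):
        self.id = id
        self.create_key = create_key

    def __hash__(self):
        # Same scheme as in the question: concatenate the digits.
        return int('%d%s' % (self.id, self.create_key))


a = HashOnly(16, '30')
b = HashOnly(16, '30')

same_hash = hash(a) == hash(b)  # True: both hash to 1630
equal = a == b                  # False: default == compares identity
members = len({a, b})           # 2: the set keeps both objects
```

This would explain output like set([1630, 1630]): two distinct objects that happen to print the same hash.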

Upvotes: 4

Sven Marnach

Reputation: 602175

You also need to implement __eq__() to match your __hash__() implementation.
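
One possible way to make the two methods agree, sketched with a hypothetical Keyed class rather than the real Perceptron: derive both __eq__ and __hash__ from the same tuple of fields. Hashing the tuple instead of concatenating digits is a substitution on my part, not something from the question; it also avoids accidental collisions such as id=1, create_key='23' versus id=12, create_key='3'.

```python
class Keyed:
    """Hypothetical class whose equality and hash share one key."""

    def __init__(self, id, create_key):
        self.id = id
        self.create_key = create_key

    def _key(self):
        # Single source of truth for both __eq__ and __hash__.
        return (self.id, self.create_key)

    def __eq__(self, other):
        if not isinstance(other, Keyed):
            return NotImplemented
        return self._key() == other._key()

    def __hash__(self):
        return hash(self._key())


a = Keyed(16, '30')
b = Keyed(16, '30')
c = Keyed(17, '30')

unique = {a, b, c}  # a and b collapse into one member
```

With this in place, adding equal instances to a set, or taking the union of two sets, should keep only one of them.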

Upvotes: 1
