Mark

Reputation: 722

Keys in python dictionaries

The problem that I am having is distributed over many source files and my attempts to reproduce the problem in a simple linear format have failed. Nonetheless the problem I am having is simply described.

I have a class Path for which I implement __hash__ and __eq__.

I have an item of type Path in a dict as evidenced by

path in list(thedict)
>> True

I verify that path == other, hash(path) == hash(other) and id(path) == id(other), where other is an item taken straight out of list(thedict.keys()). Yet I get the following:

path in thedict
>> False

and attempting the following results in a KeyError

thedict[path]

So my question is: under what circumstances is this possible? I would have expected that if the path is in list(thedict), then it must be in thedict.keys(), and hence we must be able to write thedict[path]. What is wrong with this assumption?

Further Info

If it helps, the classes in question are listed below. It is at the level of SpecificationPath that the above issue is observed.

from dataclasses import dataclass

class Path:
    pass

@dataclass
class ConfigurationPath(Path):
    configurationName: str = None
    
    def __repr__(self) -> str:
        return self.configurationName

    def __hash__(self):
        return hash(self.configurationName)

    def __eq__(self, other):
        if not isinstance(other, ConfigurationPath):
            return False
        return self.configurationName == other.configurationName

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

@dataclass
class SpecificationPath(Path):
    configurationPath: ConfigurationPath
    specificationName: str = None
    
    def __repr__(self) -> str:
        return f"{self.configurationPath}.{self.specificationName or ''}"

    def __hash__(self):
        return hash((self.configurationPath, self.specificationName))
    
    def __eq__(self, other):
        if not isinstance(other, SpecificationPath):
            return False
        if self.configurationPath != other.configurationPath:
            return False
        if self.specificationName != other.specificationName:
            return False
        return True

In response to a comment below, here is the output in the (Spyder) debug terminal, where pf is an object containing the paths dictionary using paths as keys and the object in question (self) has the path.

In : others = list(pf.paths.keys())
In : other = others[1]
In : self.path is other
Out[1]: True
In : self.path in pf.paths
Out[1]: False

Upvotes: 0

Views: 90

Answers (1)

ShadowRanger

Reputation: 155594

Per your comment:

The paths do want to be mutable as I am setting specificationName to None in places (leaving them Anonymous to be filled out later). Further, it is on an instance where the specificationName is None that this occurs, however in my simple test scripts I can get away with setting this to None without an error. Could mutability of the hashable instances cause an error such as this?

There's your problem. You're putting these objects in a dict immediately after creation, while specificationName is None, so each one is stored in the dict with a hash code based on None (that hash code is cached in the dict itself, and using that hash code is the only way to look up the object in the future). If you subsequently change specificationName to anything that produces a different hash value (read: almost anything else), the object is still stored in the bucket corresponding to the old hash code, but using it for lookups computes the new hash code and searches a different bucket. The check path in list(thedict) still succeeds because a list search compares elements by identity and equality and never uses the hash.
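To see the effect in isolation, here is a minimal sketch. MutableKey is a hypothetical stand-in for your class, not your actual code, and it hashes on an integer field instead of specificationName purely so that the hash values are predictable (string hashes are randomized per process). The behavior shown is what I'd expect on CPython.

```python
class MutableKey:
    """Hypothetical key type whose hash depends on a mutable field."""

    def __init__(self, number):
        self.number = number

    def __hash__(self):
        return hash(self.number)

    def __eq__(self, other):
        return isinstance(other, MutableKey) and self.number == other.number


key = MutableKey(1)
table = {key: "payload"}      # the dict stores the entry under hash(1)

key.number = 2                # the key now hashes like the integer 2

in_list = key in list(table)  # list search: identity/equality only, no hashing
in_dict = key in table        # dict search: probes using the new hash
print(in_list, in_dict)       # True False (on CPython)

key.number = 1                # restoring the old value makes the entry reachable again
found_again = key in table
print(found_again)            # True
```

The entry never left the dict; it just sits where the old hash put it, which is why iterating over the keys still yields the object while a hash-based lookup misses it.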

If specificationName must be mutable, then it cannot be part of the hash; it's as simple as that. This will potentially increase collisions, but it can't be helped: a mutable field can't be part of the hash without triggering this exact problem.
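One possible shape for that, as a sketch rather than a drop-in replacement: SpecPath below is a hypothetical, flattened variant of your SpecificationPath (the configuration is a plain string here, and I'm assuming it never changes after creation). Only the fixed field feeds the hash, while equality still compares both fields.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SpecPath:
    """Hypothetical simplified path; only the fixed field is hashed."""

    configurationName: str                    # assumed fixed after creation
    specificationName: Optional[str] = None   # may be filled in later

    def __hash__(self):
        # Deliberately ignores specificationName so the hash never changes.
        return hash(self.configurationName)

    def __eq__(self, other):
        if not isinstance(other, SpecPath):
            return NotImplemented
        return (self.configurationName, self.specificationName) == (
            other.configurationName,
            other.specificationName,
        )


path = SpecPath("config")           # anonymous for now
paths = {path: "payload"}

path.specificationName = "spec"     # filled in later; the hash is unchanged

print(path in paths)                        # True
print(SpecPath("config", "spec") in paths)  # True: same hash, equal fields
print(SpecPath("config", "other") in paths) # False: same hash, unequal fields
```

The cost is the one mentioned above: every SpecPath sharing a configurationName hashes identically, so those lookups fall back to __eq__ comparisons within the same bucket.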

Upvotes: 6
