List lookup is slower than set lookup

Question

I'm currently doing a project in which I have to check multiple times if a given 2D point is "legal", that is if it's within the boundaries of a grid and if it hasn't been visited before. The grid has a fixed H x W size, so I thought I could maybe store them not in a set, but a 2D lookup table. To my surprise, it's significantly slower than checking if it appears in a set, even though it's (in theory) an O(logn) operation, as opposed to O(1) with a simple array lookup.

First try:

class PointSet:
    def __init__(self, h, w):
        if h <= 0 or w <= 0:
            raise ValueError("Invalid size")
        self.h = h
        self.w = w
        self.lookup = [[False for _ in range(w)] for _ in range(h)]
        self.size = 0

    def add(self, point):
        r, c = point
        self.lookup[r][c] = True
        self.size += 1

    def remove(self, point):
        r, c = point
        self.lookup[r][c] = False
        self.size -= 1

    def __contains__(self, point):
        r, c = point
        return 0 <= r < self.h and 0 <= c < self.w and self.lookup[r][c]

    def __bool__(self):
        return self.size != 0

    def __len__(self):
        return self.size

# H, W = ...
# pointset = PointSet(H, W)
# if (3, 5) in pointset:
#   ...

Second try:

#  pointset = set()
#  if (3, 5) in pointset:
#    ...

Second code executes much faster. Why is that the case?

Moses Koledoye · Accepted Answer

First, you have to consider that your implementation of __contains__ method calls __getitem__ twice (because you have a list of lists) with each having 0(1) complexity, and would evaluate two chained comparisons (0 <= r < self.h and 0 <= c < self.w); in the worst case, all the expressions are evaluated since and will only short-circuit on the first False. There's an additional overhead coming from the iterable unpacking in the preceding line. There are also minor additions from the attribute lookups.

Membership lookup for sets is just plain 0(1), so I don't see how your code can possibly beat that.

List lookup is slower than set lookup

Answers (1)

Related Questions