Malaski
Malaski

Reputation: 1

Linearizability proof of a toy SET implementation over a filesystem

I'm currently studying concepts related to linearizability and I struggle with going from the linearization points of an implementation to the actual proof of linearizability.

As far as I have understood, once linearization points (LPs) have been specified, one has to prove that for each concurrent execution, the chosen LPs induce an equivalent linearized (sequential) execution, but I don't really understand how to actually do that in practice.

As a toy example, I tried to implemented a (very inefficient) persistent linearizable SET that uses POSIX calls, this would be the python code: (value is both the content and the filename for simplicity here)

class LinearizableSetOnDisk:
    def __init__(self, directory):
        self.directory = directory
        os.makedirs(self.directory, exist_ok=True)

    def add(self, value):
        filepath = os.path.join(self.directory, value)
        temp_path = os.path.join(self.directory, ''.join(random.choices(string.ascii_lowercase, k=10)))

        with open(temp_path, "w") as temp_file:
            temp_file.write(value)

        try:
            os.rename(temp_path, filepath)  # Linearization point
            return True
        except FileExistsError:
            return False


    def remove(self, value):
        filepath = os.path.join(self.directory, value)
        try:
            os.unlink(filepath)  # Linearization point
            return True
        except FileNotFoundError :
            return False

It seems obviously linearizable because os.rename and os.unlink atomically add or remove the value and the OS will impose a total order on these systems calls. Is it really enough to say that to be sure that this SET implementation would be linearizable ?

I feel that I'm still missing something here...

I've tried to enumerate all cases for example for just two concurrent operations (ADD|ADD, ADD|REMOVE, REMOVE|REMOVE) depending on the current state of the SET (i.e. the value is present or not), but I really don't see how to generalize.

Thanks for you help :)

Upvotes: 0

Views: 34

Answers (0)

Related Questions