Reputation:
I handed this in as part a school assignment and the person who marked it mentioned that this section was not thread safe.
The assignment was to create a multithreaded socket server in python that accepted a number and returned the Fibonacci value of that number. My approach was to memoize the calculations by sharing a dictionary between each of the threads.
Here is the code (with error handling and such removed for brevity)
from socketserver import ThreadingMixIn, TCPServer, BaseRequestHandler
class FibonacciThreadedTCPServer(ThreadingMixIn, TCPServer):
def __init__(self, server_address):
TCPServer.__init__(self, server_address, FibonacciThreadedTCPRequestHandler, bind_and_activate=True)
#this dictionary will be shared between all Request handlers
self.fib_dict = {0: 0, 1: 1, 2: 1}
class FibonacciThreadedTCPRequestHandler(BaseRequestHandler):
def handle(self):
data = self.request.recv(1024).strip()
num = int(data)
result = self.calc_fib(self.server.fib_dict, num)
ret = bytes(str(result) + '\n', 'ascii')
self.request.sendall(ret)
@staticmethod
def calc_fib(fib_dict, n):
"""
Calculates the fibonacci value of n using a shared lookup table and a linear calculation.
"""
length = len(fib_dict)
while length <= n:
fib_dict[length] = fib_dict[length - 1] + fib_dict[length - 2]
length = len(fib_dict)
return fib_dict[n]
I understand that there are reads and writes occuring simultaneously in the calc_fib method, and normally that means that code is not thread-safe. However in this instance I think it's possible to prove that the code will always provide predictable results.
Is the fact that reads and writes can be occuring simultaneously enough for it to not be considered thread-safe? Or is something considered thread-safe if it always returns a reliabel result.
Why I think this code will always produce reliable results:
A read will never occur on any given index in the dictionary until a write has occurred there.
Any subsequent writes to any given index will contain the same number as the previous writes, so regardless of when the read/write sequence occurs it will always receive the same data.
I have tested this by added random sleeps between each operation and making requests with several hundred threads simultaneously and the right answer is already returned during my test.
Any thoughts or criticism would be appreciated. Thanks.
Upvotes: 7
Views: 5554
Reputation: 155418
In this particular case, the GIL should keep your code safe, because:
dict
in particular needs this guarantee, because class instances and non-local scope usually use dict
s for attribute/name lookup, and without the GIL, simply reading the values would be fraught with danger)So in CPython, this should be fine. I can't make any guarantees for other Python interpreters though; without the GIL, if they implement their dict
without internal locking, it's wholly possible a rehash operation triggered by a write in one thread could cause another thread to read from a dict
in an inconsistent/unusable state.
Upvotes: 4
Reputation: 27115
First of all, why do you think that dictionaries are thread-safe? I did a quick search of Python3 documentation (I'm a Python novice myself), and I can not find any reassurance that two unsynchronized threads can safely update the same dictionary without corrupting the dictionary internals and maybe crashing the program.
I've been writing multi-threaded code in other languages since 1980-something, and I've learned never to trust that something is thread safe just because it acts that way when I test it. I want to see documentation that it's supposed to be thread safe. Otherwise, I throw a mutex around it.
Second of all, you are assuming that fib_dict[length - 1]
and fib_dict[length - 2]
will be valid. My experience with other programming languages says not to assume it. In other programming languages, (e.g., Java), when threads share data without synchronization, it is possible for one thread to see variable updates happen in a different order from the order in which some other thread performed them. E.g., it is theoretically possible for a Java thread that accesses a Map
with no synchronization to see the size()
of the Map
increase before it sees the new values actually appear in the map. I will assume that something similar could happen in Python until somebody shows me official documentation that says otherwise.
Upvotes: 1