Reputation: 431
I have a system designed to take data via a socket and store it in a dictionary to serve as a database. Then all my other modules (GUI, analysis, write_to_log_file, etc.) will access the database and do what they need to do with the dictionary, e.g. make widgets or copy the dictionary to a log file. But since all these things happen at different rates, I chose to give each module its own thread so I can control the frequency.
In the main run function there's something like this:
from threading import Thread

import data_collector
import write_to_log_file

def main():
    db = {}
    receive_data_thread = Thread(target=data_collector.main, args=(db,))
    receive_data_thread.start()  # writes to dictionary @ 50 Hz
    log_data_thread = Thread(target=write_to_log_file.main, args=(db,))
    log_data_thread.start()  # reads dictionary @ 1 Hz
But it seems that both modules aren't working on the same dictionary instance, because log_data_thread just prints out an empty dictionary even though data_collector shows the data it has inserted into the dictionary.
There's only one writer to the dictionary, so I don't have to worry about threads stepping on each other's toes; I just need to figure out a way for all the modules to read the current database as it's being written.
Upvotes: 0
Views: 2965
Reputation: 431
Sorry, I figured out my problem, and I'm dumb. The modules were working on the same dictionary, but my logger wasn't wrapped in a while True loop, so it executed once, terminated the thread, and thus logged my dictionary to disk only once. So I made write_to_log_file.main(db) write at 1 Hz forever and set log_data_thread.daemon = True so that the logger quits once the writer thread (which won't be a daemon thread) exits. Thanks for all the input about best practices on this type of system.
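A minimal sketch of the fixed logger (the JSON output and file name are illustrative assumptions, not from the original):

import json
import time

def main(db, path="db_log.json"):  # hypothetical file name
    # Loop forever at 1 Hz; because this thread is started as a daemon,
    # it dies automatically when the non-daemon writer thread exits.
    while True:
        snapshot = dict(db)  # shallow copy so the collector can keep writing
        with open(path, "a") as f:
            f.write(json.dumps(snapshot) + "\n")
        time.sleep(1)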
Upvotes: 0
Reputation: 196
This should not be a problem. I assume you are using the threading module. I would have to know more about what data_collector and write_to_log_file are doing to figure out why they are not working.
You could technically even have more than one thread writing, and it would not be a problem because the GIL would take care of all the locking needed. Granted, you will never get more than one CPU's worth of work out of it.
Here is a simple example:
import threading
import time

def addItem(d):
    # Writer: adds a new key/value pair once per second.
    c = 0
    while True:
        d[c] = "test-%d" % c
        c += 1
        time.sleep(1)

def checkItems(d):
    # Reader: polls the dict and reports whenever it grows.
    clen = len(d)
    while True:
        if clen < len(d):
            print("dict changed", d)
            clen = len(d)
        time.sleep(.5)

DICT = {}
t1 = threading.Thread(target=addItem, args=(DICT,))
t1.daemon = True
t2 = threading.Thread(target=checkItems, args=(DICT,))
t2.daemon = True
t1.start()
t2.start()
while True:
    time.sleep(1000)
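To illustrate the multiple-writer point, here is a small variant (not from the original answer) where two threads write to distinct keys, relying on the fact that a single d[key] = value assignment is atomic in CPython:

import threading
import time

def writer(d, name):
    # Each writer updates only its own key, so the threads never
    # perform a compound read-modify-write on shared state.
    for i in range(5):
        d[name] = i
        time.sleep(0.1)

shared = {}
threads = [threading.Thread(target=writer, args=(shared, n)) for n in ("a", "b")]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(shared)  # e.g. {'a': 4, 'b': 4}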
Upvotes: 0
Reputation: 7842
Rather than using a builtin dict, you could look at using a Manager object from the multiprocessing library:
from multiprocessing import Manager
from threading import Thread

manager = Manager()
d = manager.dict()  # proxy to a dict held by the manager's server process

def do_this(d):
    d["this"] = "done"

def do_that(d):
    d["that"] = "done"

thread0 = Thread(target=do_this, args=(d,))
thread1 = Thread(target=do_that, args=(d,))
thread0.start()
thread1.start()
thread0.join()
thread1.join()
print(d)
This gives you a standard-library, thread-safe, synchronised dictionary which should be easy to swap into your current implementation without changing the design.
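For example, swapping it into the question's main() only changes how db is created (a sketch assuming the same module entry points as the question):

from multiprocessing import Manager
from threading import Thread

import data_collector
import write_to_log_file

def main():
    manager = Manager()
    db = manager.dict()  # replaces db = {}; keep a reference to the manager
    receive_data_thread = Thread(target=data_collector.main, args=(db,))
    receive_data_thread.start()
    log_data_thread = Thread(target=write_to_log_file.main, args=(db,))
    log_data_thread.start()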
Upvotes: 3
Reputation: 880877
Use a Queue.Queue to pass values from the reader threads to a single writer thread. Pass the Queue instance to each data_collector.main function; they can all call the Queue's put method.
Meanwhile, write_to_log_file.main should be passed the same Queue instance, and it can call the Queue's get method. As items are pulled out of the Queue, they can be added to the dict.
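A minimal sketch of that pattern (the (key, value) item format and the rates are illustrative assumptions; queue.Queue is the Python 3 name for Python 2's Queue.Queue):

import queue
import threading
import time

def collect(q):
    # Stand-in for data_collector.main: push (key, value) pairs at ~50 Hz.
    for i in range(100):
        q.put(("sample-%d" % i, i))
        time.sleep(0.02)

def log(q, db):
    # Stand-in for write_to_log_file.main: drain the queue into the dict.
    while True:
        key, value = q.get()  # blocks until an item is available
        db[key] = value

q = queue.Queue()
db = {}
threading.Thread(target=collect, args=(q,), daemon=True).start()
threading.Thread(target=log, args=(q, db), daemon=True).start()
time.sleep(3)
print(len(db), "items transferred")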
See also: Alex Martelli, on why Queue.Queue is the secret sauce of CPython multithreading.
Upvotes: 0