Ned U

Reputation: 431

How to make a multithreaded system work on the same dictionary in Python

I have a system designed to take data via a socket and store it in a dictionary that serves as a database. All my other modules (GUI, analysis, write_to_log_file, etc.) then access the database and do what they need to do with the dictionary, e.g. make widgets or copy the dictionary to a log file. But since all these things happen at different rates, I chose to put each module on its own thread so I can control the frequency.

In the main run function there's something like this:

    from threading import Thread
    import data_collector
    import write_to_log_file

    def main():
        db = {}
        receive_data_thread = Thread(target=data_collector.main, args=(db,))
        receive_data_thread.start()  # writes to dictionary @ 50 Hz
        log_data_thread = Thread(target=write_to_log_file.main, args=(db,))
        log_data_thread.start()  # reads dictionary @ 1 Hz

But it seems that the two modules aren't working on the same dictionary instance, because the log_data_thread just prints out an empty dictionary even while the data_collector shows the data it has inserted into the dictionary.

There's only one writer to the dictionary, so I don't have to worry about threads stepping on each other's toes; I just need to figure out a way for all the modules to read the current database as it's being written.

Upvotes: 0

Views: 2965

Answers (4)

Ned U

Reputation: 431

Sorry, I figured out my problem, and I'm dumb. The modules were working on the same dictionary, but my logger wasn't wrapped in a while True loop, so it executed once, the thread terminated, and my dictionary was only logged to disk once. So I made write_to_log_file.main(db) write constantly at 1 Hz and set log_data_thread.daemon = True, so that once the writer thread (which isn't a daemon thread) exits, the program quits. Thanks for all the input about best practices on this type of system.
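
For reference, a minimal sketch of what the fixed logger might look like (the snapshot-and-dump approach, the JSON serialisation, and the log path are assumptions for illustration, not the original code):

    import json
    import time

    def main(db):
        # Runs forever at ~1 Hz; the caller marks this thread as a daemon,
        # so the program exits when the non-daemon collector thread finishes.
        while True:
            snapshot = dict(db)  # copy under the GIL so iteration can't race
            with open("db_log.txt", "a") as f:  # hypothetical log path
                f.write(json.dumps(snapshot) + "\n")
            time.sleep(1)

Copying the dict before serialising it avoids a RuntimeError in case the collector inserts a key while json.dumps is iterating over it.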

Upvotes: 0

Luke Wahlmeier

Reputation: 196

This should not be a problem. I also assume you are using the threading module. I would have to know more about what the data_collector and write_to_log_file are doing to figure out why they are not working.

You could technically even have more than one thread writing without a problem, because in CPython the GIL takes care of the locking needed for individual dict operations. Granted, you will never get more than one CPU's worth of work out of it.

Here is a simple example:

    import threading, time

    def addItem(d):
        # Writer: add a new key to the shared dict once a second
        c = 0
        while True:
            d[c] = "test-%d" % c
            c += 1
            time.sleep(1)

    def checkItems(d):
        # Reader: print the dict whenever it has grown
        clen = len(d)
        while True:
            if clen < len(d):
                print("dict changed", d)
                clen = len(d)
            time.sleep(.5)

    DICT = {}
    t1 = threading.Thread(target=addItem, args=(DICT,))
    t1.daemon = True
    t2 = threading.Thread(target=checkItems, args=(DICT,))
    t2.daemon = True
    t1.start()
    t2.start()

    # Keep the main thread alive; the daemon threads die with it
    while True:
        time.sleep(1000)

Upvotes: 0

ebarr

Reputation: 7842

Rather than using a builtin dict, you could look at using a Manager object from the multiprocessing library:

    from multiprocessing import Manager
    from threading import Thread

    manager = Manager()
    d = manager.dict()  # proxy dict, synchronised by the Manager process

    def do_this(d):
        d["this"] = "done"

    def do_that(d):
        d["that"] = "done"

    thread0 = Thread(target=do_this, args=(d,))
    thread1 = Thread(target=do_that, args=(d,))
    thread0.start()
    thread1.start()
    thread0.join()
    thread1.join()

    print(d)

This gives you a standard-library, thread-safe, synchronised dictionary which should be easy to swap into your current implementation without changing the design.

Upvotes: 3

unutbu

Reputation: 880877

Use a Queue.Queue to pass values from the data collector threads to a single thread that owns the dict. Pass the Queue instance to each data_collector.main function; they can all call the Queue's put method.

Meanwhile the write_to_log_file.main should also be passed the same Queue instance, and it can call the Queue's get method. As items are pulled out of the Queue, they can be added to the dict.
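
A minimal sketch of that design, assuming the collectors produce (key, value) pairs (the item format, rates, and function bodies here are illustrative; queue is the Python 3 name for the Queue module):

    import queue
    import threading
    import time

    def data_collector(q):
        # Producer: push (key, value) pairs onto the queue at ~50 Hz
        n = 0
        while True:
            q.put(("sample-%d" % n, n))
            n += 1
            time.sleep(0.02)

    def write_to_log_file(q):
        # Single consumer: owns the dict, drains the queue, logs at ~1 Hz
        db = {}
        while True:
            while not q.empty():
                key, value = q.get()
                db[key] = value
            print(db)  # stand-in for writing the dict to the log file
            time.sleep(1)

    q = queue.Queue()
    threading.Thread(target=data_collector, args=(q,), daemon=True).start()
    threading.Thread(target=write_to_log_file, args=(q,), daemon=True).start()
    time.sleep(5)  # let the demo run briefly

Because only the consumer thread ever touches the dict, no explicit locking is needed at all.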

See also: Alex Martelli, on why Queue.Queue is the secret sauce of CPython multithreading.

Upvotes: 0
