Ruben

Reputation: 1437

Multiple threads needing to access a single resource

I'm currently working on a program where multiple threads need to access a single list. The list functions as a "buffer": one or more threads write into this list, and one or more other threads read from and remove items from it. My first question is: are lists in Python thread-safe? If not, what is the standard approach for dealing with this situation?

Upvotes: 1

Views: 1463

Answers (3)

mkind

Reputation: 2043

You need locks, as ATOzTOA mentioned. You create one with

import threading

lock = threading.Lock()

and a thread acquires it when it enters a critical section and releases it again once the section is finished. The Pythonic way to write this is

with lock:
    do_something(buffer)
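
Applied to the buffer from the question, a minimal sketch might look like the following (the producer/consumer functions and the fixed item count are just placeholders, and the consumer simply polls; a Queue or a Condition would avoid the polling):

import threading
import time

buffer = []                      # the shared "buffer" list
lock = threading.Lock()          # protects every access to buffer

def producer():
    for i in range(5):
        with lock:               # only one thread touches buffer at a time
            buffer.append(i)
        time.sleep(0.1)

def consumer():
    consumed = 0
    while consumed < 5:
        with lock:
            if buffer:
                item = buffer.pop(0)
                consumed += 1
                print(item)
        time.sleep(0.1)

threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
for t in threads:
    t.start()
for t in threads:
    t.join()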

Upvotes: 0

Shimon Tolts

Reputation: 1692

You should use the Queue module. Here is a good article explaining threading and queues.

import Queue
import threading
import urllib2
import time
from BeautifulSoup import BeautifulSoup

hosts = ["http://yahoo.com", "http://google.com", "http://amazon.com",
        "http://ibm.com", "http://apple.com"]

queue = Queue.Queue()
out_queue = Queue.Queue()

class ThreadUrl(threading.Thread):
    """Threaded Url Grab"""
    def __init__(self, queue, out_queue):
        threading.Thread.__init__(self)
        self.queue = queue
        self.out_queue = out_queue

    def run(self):
        while True:
            #grabs host from queue
            host = self.queue.get()

            #grabs urls of hosts and then grabs chunk of webpage
            url = urllib2.urlopen(host)
            chunk = url.read()

            #place chunk into out queue
            self.out_queue.put(chunk)

            #signals to queue job is done
            self.queue.task_done()

class DatamineThread(threading.Thread):
    """Threaded Url Grab"""
    def __init__(self, out_queue):
        threading.Thread.__init__(self)
        self.out_queue = out_queue

    def run(self):
        while True:
            #grabs a chunk of html from the out queue
            chunk = self.out_queue.get()

            #parse the chunk
            soup = BeautifulSoup(chunk)
            print soup.findAll(['title'])

            #signals to queue job is done
            self.out_queue.task_done()

start = time.time()
def main():

    #spawn a pool of threads, and pass them queue instance
    for i in range(5):
        t = ThreadUrl(queue, out_queue)
        t.setDaemon(True)
        t.start()

    #populate queue with data
    for host in hosts:
        queue.put(host)

    for i in range(5):
        dt = DatamineThread(out_queue)
        dt.setDaemon(True)
        dt.start()


    #wait on the queue until everything has been processed
    queue.join()
    out_queue.join()

main()
print "Elapsed Time: %s" % (time.time() - start)

Upvotes: 1

ATOzTOA

Reputation: 35950

Try using threading.Lock if there is only one resource.
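
A minimal sketch, assuming the shared resource is a list named buffer; the explicit acquire/release shown here is what the with lock: block in the other answer does for you:

import threading

buffer = []
lock = threading.Lock()

def append_item(item):
    lock.acquire()           # block until the lock is free
    try:
        buffer.append(item)
    finally:
        lock.release()       # always release, even if an error occurs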

Upvotes: 1
