user2822564

Reputation: 63

Running two interdependent while loops in Python?

For a web-scraping analysis I need two loops that run permanently: one returns a list of websites that is updated every x minutes, while the other analyses the sites (old and new ones) every y seconds. The code below illustrates what I am trying to do, but it doesn't work. (The code has been edited to incorporate answers and my own research.)

from multiprocessing import Process
import time, random

from threading import Lock
from collections import deque

class MyQueue(object):
    def __init__(self):
        self.items = deque()
        self.lock = Lock()

    def put(self, item):
        with self.lock:
            self.items.append(item)
# Example pointed at in [this][1] answer
    def get(self):
        with self.lock:
            return self.items.popleft()

def a(queue):
    while True:
        x=[random.randint(0,10), random.randint(0,10), random.randint(0,10)]
        print 'send', x
        queue.put(x)
        time.sleep(10)


def b(queue):
    try:
        while queue:
            x = queue.get()
            print 'receive', x
            for i in x:
                print i
            time.sleep(2)
    except IndexError:
        print queue.get()   



if __name__ == '__main__':
    q = MyQueue()
    p1 = Process(target=a, args=(q,))
    p2 = Process(target=b, args=(q,))
    p1.start()
    p2.start()
    p1.join()
    p2.join()

So, this is my first Python project after an online introductory course, and I am struggling big time. I now understand that the functions don't truly run in parallel, as b does not start until a is finished (I used this answer and tinkered with the timer and while True). EDIT: Even after using the approach given in the answer, I think this is still the case, because queue.get() throws an IndexError saying the deque is empty. I can only explain that by process a not finishing, because when I print queue.get() immediately after .put(x), it is not empty.

I eventually want an output like this:

send [3,4,6]
3
4
6
3
4
send [3,8,6,5] #the code above gives always 3 entries, but in my project 
3              #the length varies
8
6
5
3
8
6
.
.

What do I need in order to have two truly parallel loops, where one returns an updated list every x minutes that the other loop needs as the basis for its analysis? Is Process really the right tool here? And where can I find good information about designing my program?

Upvotes: 1

Views: 1569

Answers (1)

trelltron

Reputation: 577

I did something a little like this a while ago. I think using Process is the correct approach, but if you want to pass data between processes then you should probably use a multiprocessing Queue.

https://docs.python.org/2/library/multiprocessing.html#exchanging-objects-between-processes

Create the queue first and pass it into both processes. One can write to it, the other can read from it.
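To see why the hand-rolled deque wrapper in the question comes up empty, here is a minimal sketch (Python 3 syntax; the names child_append and demo are made up for illustration). Each Process works on its own copy of ordinary Python objects, so an append made in the child never reaches the parent, whereas a multiprocessing Queue carries the item across.

```python
import multiprocessing as mp
from collections import deque


def child_append(plain, shared):
    """Runs in the child process: write to both containers."""
    plain.append("hello")   # changes only the child's private copy
    shared.put("hello")     # sent through a pipe to whoever reads the queue


def demo():
    plain = deque()         # ordinary object, not shared between processes
    shared = mp.Queue()     # built for inter-process communication
    p = mp.Process(target=child_append, args=(plain, shared))
    p.start()
    received = shared.get(timeout=5)  # blocks until the child's item arrives
    p.join()
    return len(plain), received


if __name__ == "__main__":
    # The deque is still empty in the parent; only the Queue delivered the item.
    print(demo())
```

The same reasoning would apply to the threading Lock in the question's class: it protects threads inside one process, not separate processes.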

One issue I remember is that the reading process will block on the queue until something is pushed to it, so you may need to push a special 'terminate' message of some kind to the queue when process 1 is done so process 2 knows to stop.
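A minimal sketch of that 'terminate' message idea, in Python 3 syntax. The STOP value, the function names and the handle parameter are assumptions for illustration, not part of the original code; any marker that can never be real data would do.

```python
from multiprocessing import Process, Queue

STOP = None  # hypothetical sentinel: a value that never occurs as real data


def producer(q, batches):
    """Send each batch, then the sentinel to say that nothing more follows."""
    for batch in batches:
        q.put(batch)
    q.put(STOP)


def consumer(q, handle=print):
    """Block on the queue and handle batches until the sentinel arrives."""
    while True:
        batch = q.get()      # blocks until something is available
        if batch is STOP:
            break            # the producer is done, so stop reading
        handle(batch)


if __name__ == "__main__":
    q = Queue()
    p1 = Process(target=producer, args=(q, [[3, 4, 6], [3, 8, 6, 5]]))
    p2 = Process(target=consumer, args=(q,))
    p1.start()
    p2.start()
    p1.join()
    p2.join()  # returns because the consumer leaves its loop on STOP
```

With more than one reading process, one sentinel per reader would be needed, since each sentinel is consumed by whichever reader gets it.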

EDIT: Simple example. It doesn't include a clean way to stop the processes, but it shows how you can start two new processes and pass data from one to the other. Function b polls the queue once per second with a non-blocking get(False); if nothing new has arrived from a, it keeps the previous list and prints it again.

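from multiprocessing import Process, Queue
from Queue import Empty  # Python 2 module name; in Python 3 it is "queue"
import time, random

def a(queue):
    # Producer: send a fresh list of three random numbers every 5 seconds.
    while True:
        x=[random.randint(0,10), random.randint(0,10), random.randint(0,10)]
        print 'send', x
        queue.put(x)
        time.sleep(5)


def b(queue):
    # Consumer: once per second, pick up a new list if one is waiting,
    # otherwise keep working with the previous one.
    x = []
    while True:
        time.sleep(1)
        try:
            x = queue.get(False)  # non-blocking; raises Empty if nothing is waiting
            print 'receive', x
        except Empty:
            pass
        for i in x:
            print i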


if __name__ == '__main__':
    q = Queue()
    p1 = Process(target=a, args=(q,))
    p2 = Process(target=b, args=(q,))
    p1.start()
    p2.start()
    p1.join()
    p2.join()
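One limitation of the loop above: a single get(False) per tick picks up only one list, so if a ever sends faster than b ticks, b falls behind. A possible refinement, sketched in Python 3 syntax with a hypothetical helper named latest_or, drains everything that is waiting and keeps only the newest list.

```python
import queue


def latest_or(q, current):
    """Return the newest list waiting in q, or `current` if nothing has arrived."""
    while True:
        try:
            current = q.get_nowait()   # same as q.get(False)
        except queue.Empty:            # multiprocessing.Queue raises this too
            return current


if __name__ == "__main__":
    q = queue.Queue()
    sites = []
    q.put([3, 4, 6])
    q.put([3, 8, 6, 5])
    sites = latest_or(q, sites)
    print(sites)                 # [3, 8, 6, 5]
    print(latest_or(q, sites))   # nothing new, so the old list is kept
```

Inside b, the try/except block would become x = latest_or(queue, x). With a multiprocessing Queue a freshly sent item can take a moment to become visible, in which case it is simply picked up on a later tick.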

Upvotes: 2
