SKB

Reputation: 199

Multiprocessing in Django and Python code

I am trying to implement multiprocessing in my application on a Windows system.

The scenario is: from the GUI, when I click the "Run" button, control comes to a Python function (which is not the main function).

In this function I am running a loop and reading/executing multiple files, one at a time. I want this to happen in parallel.

But as multiprocessing.Process() requires the if __name__ == '__main__': guard, the function I specify in target= is not being invoked by multiprocessing.

How can I make this happen? And if multiprocessing is the wrong way, is there any alternative way to improve code performance?

Adding sample code (please note that this is just pseudocode where I have added high-level code to show the flow; please excuse any syntax errors):

urls.py file:

from django.urls import path
from textapp import views

urlpatterns = [
    path('execute/', views.functiontomultiprocess),
    # ... other URLs
]

views.py:

import multiprocessing

def functiontomultiprocess(request):
    nprocess = []
    for doc in alldocs:
        p = multiprocessing.Process(target=function2)
        p.start()  # start process
        nprocess.append(p)
        for p1 in nprocess:
            p1.join()

Upvotes: 5

Views: 2870

Answers (2)

Timur U

Reputation: 445

You can use a task runner, in particular Celery.

With Celery it is possible to create a "queue of tasks":

tasks.py

from celery import task

@task
def myJob(*args,**kwargs):
    # main task
    # . . .

my_views.py

from django.shortcuts import render_to_response as rtr

from .tasks import myJob

def view(request):
    # view
    # . . .
    myJob.delay(*args,**kwargs)
    return rtr('template.html', {'message': 'Job has been entered'})

The call to .delay will register *myJob* for execution by one of your *celery* workers, but won't block execution of the view.

The task isn't executed until a worker becomes free, so you should have no problems with the number of processes.
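
For completeness, a minimal sketch of the wiring this answer assumes (the project name textproject, the broker URL, and the file layout are placeholders, not from the original post): Celery needs an app instance next to settings.py and a separately running worker process.

# textproject/celery.py (hypothetical project name)
import os
from celery import Celery

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'textproject.settings')

app = Celery('textproject')
# read CELERY_* settings from Django's settings.py
app.config_from_object('django.conf:settings', namespace='CELERY')
app.autodiscover_tasks()  # finds task functions in each app's tasks.py

With, for example, CELERY_BROKER_URL = 'redis://localhost:6379/0' in settings.py (any supported broker works), the worker is started with: celery -A textproject worker -l info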

Upvotes: 2

Booboo

Reputation: 44118

This is too long to specify in a comment, so:

Again, I have no expertise in Django, but I would think this would not cause a problem on either Windows or Linux/Unix. However, you did not specify your platform, which was requested. More importantly, the code you provided would accomplish very little, because your loop creates a process and then waits for it to complete before creating the next one. You never have more than one process running at a time, and thus there is no parallelism. To correct that, try the following:

import multiprocessing

def functiontomultiprocess(request):
    processes = []
    for doc in alldocs:  # where is alldocs defined?
        p = multiprocessing.Process(target=function2, args=(doc,))  # pass doc to function2
        processes.append(p)
        p.start()
    # now wait for the processes to complete
    for p in processes:
        p.join()

Or if you want to use a pool, you have choices. This uses the concurrent.futures module:

import concurrent.futures
import multiprocessing  # needed for cpu_count()

def functiontomultiprocess(request):
    """
    Does it make sense to create more processes than CPUs you have?
    It might if there is a lot of I/O. In which case try:
    n_processes = len(alldocs)
    """
    n_processes = min(len(alldocs), multiprocessing.cpu_count())
    with concurrent.futures.ProcessPoolExecutor(max_workers=n_processes) as executor:
        futures = [executor.submit(function2, doc) for doc in alldocs] # create sub-processes
        return_values = [future.result() for future in futures] # get return values from function2
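
Both pool variants assume function2 is defined at module level (nested functions can't be pickled for child processes) and that its arguments and return value are picklable. A hypothetical stand-in, just to make the snippets concrete:

def function2(doc):
    # stand-in for the asker's per-file work: reads one file and
    # returns something picklable (here, its length in characters)
    with open(doc) as f:
        return len(f.read())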

This uses the multiprocessing module:

import multiprocessing

def functiontomultiprocess(request):
    n_processes = min(len(alldocs), multiprocessing.cpu_count())
    with multiprocessing.Pool(processes=n_processes) as pool:
        results = [pool.apply_async(function2, (doc,)) for doc in alldocs] # create sub-processes
        return_values = [result.get() for result in results] # get return values from function2
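
Since each call takes exactly one argument, the apply_async loop above could also be written with pool.map, which submits the whole list and collects the return values in input order:

import multiprocessing

def functiontomultiprocess(request):
    n_processes = min(len(alldocs), multiprocessing.cpu_count())
    with multiprocessing.Pool(processes=n_processes) as pool:
        return_values = pool.map(function2, alldocs)  # one task per doc, results in order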

Now you just have to try it and see.

Upvotes: 3
