Reputation: 199
I am trying to implement multiprocessing in my application on a Windows system.
The scenario is: from the GUI, when I click the "Run" button, control comes to a Python function (which is not the main function).
In this function I run a loop and read/execute multiple files, one at a time. I want this to happen in parallel.
But since multiprocessing.Process() needs the if __name__ == '__main__': guard, the function I pass as target= is never invoked.
How can I make this work? If multiprocessing is the wrong approach, is there any alternative way to improve the code's performance?
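For reference, my understanding is that a standalone script on Windows needs a guard like the one below (the worker function and file name here are just placeholders), and I don't see how to do the equivalent inside a Django view:

import multiprocessing

def worker(doc):
    # placeholder for the real per-file work
    print('processing', doc)

if __name__ == '__main__':  # required on Windows because of the spawn start method
    p = multiprocessing.Process(target=worker, args=('file1.txt',))
    p.start()
    p.join()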
Adding sample code (please note that this is just pseudocode showing the high-level flow; please excuse any syntax errors):
urls.py file:
from django.urls import path
from textapp import views
urlpatterns = [
    path('execute/', views.functiontomultiprocess),
    ...  # other urls
]
views.py:
def functiontomultiprocess(request):
    nprocess = []
    for doc in alldocs:
        p = multiprocessing.Process(target=function2)
        p.start()  # start process
        nprocess.append(p)
        for p1 in nprocess:
            p1.join()
Upvotes: 5
Views: 2870
Reputation: 445
You can use a task runner, in particular Celery.
With Celery you can create a queue of tasks:
my_task.py
from celery import task

@task
def myJob(*args, **kwargs):
    # main task
    ...
my_views.py
from django.shortcuts import render_to_response as rtr

from .my_task import myJob

def view(request):
    # view
    # . . .
    myJob.delay(*args, **kwargs)
    return rtr('template.html', {'message': 'Job has been entered'})
Calling .delay() will queue myJob for execution by one of your Celery workers, but it won't block the view.
The task is not executed until a worker becomes free, so you should have no problems with the number of processes.
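For completeness, a minimal sketch of the Celery wiring this assumes; the project name ('proj'), the Redis broker URL, and the settings module are placeholders, so adjust them to your project (and depending on your Celery version you may need the shared_task decorator instead of task):

# proj/celery.py
import os
from celery import Celery

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'proj.settings')  # assumed settings module

app = Celery('proj', broker='redis://localhost:6379/0')  # assumed Redis broker
app.config_from_object('django.conf:settings', namespace='CELERY')
app.autodiscover_tasks()

You would then start a worker with something like: celery -A proj worker -l info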
Upvotes: 2
Reputation: 44118
This is too long to specify in a comment, so:
Again, I have no expertise in Django, but I would not expect this to be a problem on either Windows or Linux/Unix (although you did not specify your platform, which was requested). More importantly, the code you posted achieves very little: your loop creates a process and then waits for it to complete before creating the next one, so you never have more than one process running at a time and there is no parallelism. To correct that, try the following:
import multiprocessing

def functiontomultiprocess(request):
    processes = []
    for doc in alldocs:  # where is alldocs defined?
        p = multiprocessing.Process(target=function2, args=(doc,))  # pass doc to function2
        processes.append(p)
        p.start()
    # now wait for the processes to complete
    for p in processes:
        p.join()
Or if you want to use a pool, you have choices. This uses the concurrent.futures module:
import concurrent.futures
import multiprocessing

def functiontomultiprocess(request):
    """
    Does it make sense to create more processes than CPUs you have?
    It might if there is a lot of I/O. In that case try:
    n_processes = len(alldocs)
    """
    n_processes = min(len(alldocs), multiprocessing.cpu_count())
    with concurrent.futures.ProcessPoolExecutor(max_workers=n_processes) as executor:
        futures = [executor.submit(function2, doc) for doc in alldocs]  # create subprocesses
        return_values = [future.result() for future in futures]  # get return values from function2
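If function2 takes a single argument and you want the results in input order, executor.map is a slightly shorter equivalent (same assumptions as above):

with concurrent.futures.ProcessPoolExecutor(max_workers=n_processes) as executor:
    return_values = list(executor.map(function2, alldocs))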
And this uses the multiprocessing module:
import multiprocessing

def functiontomultiprocess(request):
    n_processes = min(len(alldocs), multiprocessing.cpu_count())
    with multiprocessing.Pool(processes=n_processes) as pool:
        results = [pool.apply_async(function2, (doc,)) for doc in alldocs]  # create subprocesses
        return_values = [result.get() for result in results]  # get return values from function2
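Likewise, Pool.map is the shorter form when function2 takes a single argument and you want the results in input order:

with multiprocessing.Pool(processes=n_processes) as pool:
    return_values = pool.map(function2, alldocs)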
Now you just have to try it and see.
Upvotes: 3