Reputation: 1540
I am new to multiprocess development. I have to rename a lot of files (approximately 70,000),
so here is what I have done:
import os
import sys
import glob
from multiprocessing import Process

cst_id = sys.argv[1]
data = sys.argv[2]
main_path = "/scality/hmo02/data/fdata/" + data + "/isei/" + cst_id + "/"
nb_process = 100

def rename(file_lst):
    # Prefix every entry with "001_" unless it already has the prefix.
    for f in file_lst:
        split_path = os.path.split(f)
        container_path = split_path[0]
        dir_to_rename = split_path[1]
        if dir_to_rename.startswith('001_'):
            pass
        else:
            new_path = container_path + "/001_" + dir_to_rename
            os.rename(f, new_path)

def chunks(l, n):
    # Split l into consecutive chunks of at most n items each.
    return [l[x: x+n] for x in range(0, len(l), n)]

if data == "odata":
    id_data_file_lst = glob.glob(main_path + "*/*/*/*")
    # nb_process is used as the chunk size here, so one process is started per chunk.
    chunked_lst = chunks(id_data_file_lst, nb_process)
    proc_lst = []
    for lst in chunked_lst:
        # args must be a tuple: (lst,) passes the whole list as one argument.
        proc = Process(target=rename, args=(lst,))
        proc.start()
        proc_lst.append(proc)
    for p in proc_lst:
        p.join()
I have another rename to do after this one, but I have to be sure the first one is finished before I launch the second. My question is: how do I know when all processes have terminated?
I am not sure I am using p.join() correctly.
Upvotes: 2
Views: 2184
Reputation: 16448
See https://docs.python.org/3/library/multiprocessing.html. multiprocessing.Process.join
blocks until the process has finished, and multiprocessing.Process.is_alive
tells you whether a process is still running. Once your loop over proc_lst has called p.join()
on every process, all of them have terminated and it is safe to start the next rename.
If you also want a notification per process, you can wrap your function rename
in another function that calls a callback after rename
has finished.
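To make the behaviour of join and is_alive concrete, here is a minimal sketch. The worker `work` and the helper `run_and_wait` are hypothetical stand-ins for the rename job, not part of the original code; the sleep only simulates some work.

```python
import time
from multiprocessing import Process


def work(seconds):
    # Hypothetical stand-in for the real rename job.
    time.sleep(seconds)


def run_and_wait(durations):
    """Start one process per duration, wait for all, return their exit codes."""
    procs = [Process(target=work, args=(d,)) for d in durations]
    for p in procs:
        p.start()
    for p in procs:
        p.join()  # blocks until this particular process has terminated
    # After the join loop, none of the processes is alive any more.
    assert not any(p.is_alive() for p in procs)
    return [p.exitcode for p in procs]


if __name__ == "__main__":
    print(run_and_wait([0.1, 0.2]))
```

An exit code of 0 means the process ended normally; a non-zero value would indicate that the worker raised an exception or was killed, which is worth checking before starting the second treatment.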
Here is an example for the callback:
import os
import sys
import glob
from multiprocessing import Process

cst_id = sys.argv[1]
data = sys.argv[2]
main_path = "/scality/hmo02/data/fdata/" + data + "/isei/" + cst_id + "/"
nb_process = 100

def rename(file_lst):
    # Prefix every entry with "001_" unless it already has the prefix.
    for f in file_lst:
        split_path = os.path.split(f)
        container_path = split_path[0]
        dir_to_rename = split_path[1]
        if dir_to_rename.startswith('001_'):
            pass
        else:
            new_path = container_path + "/001_" + dir_to_rename
            os.rename(f, new_path)

def cb():
    # Callback: runs inside the child process once rename has returned.
    print('Process has finished')

def wrapper(file_lst):
    rename(file_lst)
    cb()

def chunks(l, n):
    # Split l into consecutive chunks of at most n items each.
    return [l[x: x+n] for x in range(0, len(l), n)]

if data == "odata":
    id_data_file_lst = glob.glob(main_path + "*/*/*/*")
    chunked_lst = chunks(id_data_file_lst, nb_process)
    proc_lst = []
    for lst in chunked_lst:
        # args must be a tuple: (lst,) passes the whole list as one argument.
        proc = Process(target=wrapper, args=(lst,))
        proc.start()
        proc_lst.append(proc)
    for p in proc_lst:
        # join blocks until the process has terminated.
        p.join()
    # Reaching this point means every process has finished.
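As an alternative to managing Process objects by hand, a multiprocessing.Pool keeps the number of worker processes fixed and its map method only returns when all items have been handled. The following is a rough sketch under assumptions, not a drop-in replacement: `add_prefix`, `rename_all`, the worker count of 4 and the temporary directory are all made up for illustration.

```python
import os
import tempfile
from multiprocessing import Pool


def add_prefix(path, prefix="001_"):
    # Rename one entry unless it already carries the prefix (hypothetical helper).
    parent, name = os.path.split(path)
    if name.startswith(prefix):
        return path
    new_path = os.path.join(parent, prefix + name)
    os.rename(path, new_path)
    return new_path


def rename_all(paths, workers=4):
    # Pool.map blocks until every path has been processed, so the caller
    # can start the next treatment as soon as this function returns.
    with Pool(processes=workers) as pool:
        return pool.map(add_prefix, paths)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        paths = [os.path.join(tmp, "dir_%d" % i) for i in range(10)]
        for p in paths:
            os.mkdir(p)
        renamed = rename_all(paths)
        print(sorted(os.path.basename(p) for p in renamed))
```

With this approach the second rename can simply be another call placed after `rename_all`, because the first call has already waited for all workers.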
Upvotes: 3