Aritra Bhattacharya

Reputation: 780

Multiprocessing in Python performing parallel writes

I am trying to run a basic task in parallel to explore multiprocessing in Python. I have 35 files, and I want to do some formatting on each of them and write the results to new files. I have written the code below:

import csv
import json
import os 
import multiprocessing as mp

path = '<somepath>'

total_csv_file_list = []
for filename in os.listdir(path+'csv_3'):
    total_csv_file_list.append(os.path.join(path+'csv_3',filename))
print(total_csv_file_list)    

def run_sed (path):
    path_csv=str(path)+'csv_3'
    path_json=str(path)+'json'
    for filename in path:
#       print("sed '1s/^/[/;$!s/$/,/;$s/$/]/' "+ (filename)+'>'+(filename.split('.')[0]+'.json'))
        os.system("sed '1s/^/[/;$!s/$/,/;$s/$/]/' "+ (filename)+'>'+(filename.split('.')[0]+'.json'))

#run_sed(total_csv_file_list)
p = mp.Pool(processes=mp.cpu_count())
total_file_list = p.map(run_sed,total_csv_file_list)
p.close()
p.join()

I printed out the sed command before adding the multiprocessing part; it is formed correctly and runs fine when I execute it from my shell. But when I run it through the code above, I get the error below:

sed: can't read r: No such file or directory
sed: can't read A: No such file or directory
sed: can't read i: No such file or directory
sed: can't read k: No such file or directory

Any help would be greatly appreciated.

Upvotes: 1

Views: 84

Answers (1)

Thomas Weller

Reputation: 59208

If path is a string, then for filename in path: will run over each character of the string.
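A minimal sketch of that behavior, using a hypothetical single file name: iterating over a string yields one character at a time, which is exactly what sed ends up being handed here.

path = '/tmp/Aritra/airlines.csv'   # hypothetical single file name passed by map()
for filename in path:
    print(filename)                 # prints '/', 't', 'm', 'p', '/', 'A', 'r', 'i', ...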

Note that the list of file names that you pass in this line

p.map(run_sed,total_csv_file_list)

will already be separated into single file names by the map() function, so there's no need to loop over it again.

Concrete action: in def run_sed(path):, get rid of the whole line for filename in path: and have the os.system call use path (the single file name) directly.
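A minimal sketch of what the script could look like after that change, keeping the question's sed command, file-list construction, and the '<somepath>' placeholder; the parameter is renamed csv_file here only for clarity:

import multiprocessing as mp
import os

path = '<somepath>'  # same placeholder as in the question

total_csv_file_list = [os.path.join(path + 'csv_3', f)
                       for f in os.listdir(path + 'csv_3')]

def run_sed(csv_file):
    # Pool.map() calls run_sed once per list element, so csv_file is
    # already a single file name and no inner loop is needed.
    os.system("sed '1s/^/[/;$!s/$/,/;$s/$/]/' "
              + csv_file + '>' + csv_file.split('.')[0] + '.json')

if __name__ == '__main__':
    p = mp.Pool(processes=mp.cpu_count())
    p.map(run_sed, total_csv_file_list)
    p.close()
    p.join()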

Upvotes: 2
