Reputation: 780
I am trying to run a task in parallel. It is a basic task; I am just exploring multiprocessing in Python. I have 35 files, and I want to reformat each one and write the result to a new file. I have written the code below:
import csv
import json
import os
import multiprocessing as mp
path = '<somepath>'
total_csv_file_list = []
for filename in os.listdir(path+'csv_3'):
    total_csv_file_list.append(os.path.join(path+'csv_3',filename))
print(total_csv_file_list)
def run_sed (path):
    path_csv=str(path)+'csv_3'
    path_json=str(path)+'json'
    for filename in path:
        # print("sed '1s/^/[/;$!s/$/,/;$s/$/]/' "+ (filename)+'>'+(filename.split('.')[0]+'.json'))
        os.system("sed '1s/^/[/;$!s/$/,/;$s/$/]/' "+ (filename)+'>'+(filename.split('.')[0]+'.json'))
#run_sed(total_csv_file_list)
p = mp.Pool(processes=mp.cpu_count())
total_file_list = p.map(run_sed,total_csv_file_list)
p.close()
p.join()
I printed the sed command before adding multiprocessing; it is formed correctly and runs fine when I execute it from my shell. But when I run it through the code above, I get the following error:
sed: can't read r: No such file or directory
sed: can't read A: No such file or directory
sed: can't read i: No such file or directory
sed: can't read k: No such file or directory
Any help would be greatly appreciated.
Upvotes: 1
Views: 84
Reputation: 59208
If path is a string, then for filename in path: will run over each character of the string. That is why sed reports single characters such as r, A, i and k as files it cannot read.
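A quick way to see this behaviour in isolation is the small sketch below. The file name used here is hypothetical and is not taken from the question.

```python
# Iterating over a string yields its characters, not file names.
path = "data/a.csv"  # hypothetical file name, for illustration only

chars = [ch for ch in path]
print(chars[:4])  # ['d', 'a', 't', 'a']
```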
Note that the list of file names that you pass in this line
p.map(run_sed,total_csv_file_list)
will already be split into single file names by the map() function, so there's no need to loop over it again.
Concrete action: in def run_sed (path): get rid of the whole line for filename in path: and use path itself as the file name in the lines that were inside the loop.
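Putting that together, a minimal sketch of the corrected worker might look like the following. It assumes sed is available on the system; the helper names json_path_for and convert_all are made up for this example, and subprocess.run with os.path.splitext are swapped in for the original os.system call and split('.') as one possible choice, not the only one.

```python
import multiprocessing as mp
import os
import subprocess

# Same sed program as in the question.
SED_SCRIPT = "1s/^/[/;$!s/$/,/;$s/$/]/"


def json_path_for(csv_path):
    """Return the output path: same name with the extension replaced by .json.

    Hypothetical helper; splitext only strips the final extension, so dots
    elsewhere in the path are left alone.
    """
    return os.path.splitext(csv_path)[0] + ".json"


def run_sed(csv_path):
    """Convert ONE file. Pool.map calls this once per element of the list."""
    out_path = json_path_for(csv_path)
    with open(out_path, "w") as out:
        # An argument list avoids shell quoting problems with odd file names.
        subprocess.run(["sed", SED_SCRIPT, csv_path], stdout=out, check=True)
    return out_path


def convert_all(csv_dir):
    """Hypothetical driver: run run_sed over every file in csv_dir."""
    files = [os.path.join(csv_dir, name) for name in os.listdir(csv_dir)]
    with mp.Pool(processes=mp.cpu_count()) as pool:
        return pool.map(run_sed, files)
```

To use it, call convert_all on the csv_3 directory from inside an if __name__ == "__main__": guard, which multiprocessing needs on platforms that spawn rather than fork worker processes.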
Upvotes: 2