bark

Reputation: 51

Parallel computing: output each process to a file

I have a function that I would like to compute for several values in parallel and for each computation, I would like to save the output to its own file.

I am trying to use the multiprocessing package to run this in parallel, but I am new to it and may not be using it correctly (in particular, I am unsure when to use the Pool class versus the Process class). I already know how to redirect my output to a file with the bash command below. In essence, this is a question about how to run the computations in parallel and write each result to its own txt file.

Here is the toy example I am currently working with. This function is saved in a file called test.py:

def fcn(n):
    c = int(n*(n-1)/2)
    print('My output will print several things like this code')
    print(c)
    return

Normally when I write the output of a function to a file, I use the bash command

python test.py > output.txt

For parallel computing, the sample code I have stores each output as an element of a list, but it does not write the output to a file, which is what I would like. (My actual problem would keep a lot in memory if every output were stored in a list like this. I would like each computation to write to a file and move on to the next one.)

import multiprocessing as mp

pool = mp.Pool(mp.cpu_count())
results = [pool.apply(fcn, args=(n,)) for n in range(6)]
pool.close()

The output I would like is 6 different txt files, one for each of fcn(n) with n=0,1,2,3,4,5. If possible, I would like each file to be named after its input, for example 0.txt, 1.txt, etc. Any insight would be greatly appreciated!

Upvotes: 1

Views: 824

Answers (2)

user3666197

Reputation: 1

Q : ...would like to save the output to its own file.

For the sake of simplicity, let's keep it simple and a bit safer:

Given the task, each running process is independent of the others and has its own, exclusively owned file I/O directed to a "private" file, numbered or otherwise differentiated.

You may just add a few SLOCs: a with ... as aFile: context-managed block for the file I/O and, most probably, a few try: / except: / finally: sections to handle cases where something goes wrong on the fly:

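def fcn( n ):
    try:                                                               #.________
        with open( "aFileFromPROC[{0:}].txt".format( n ), "w" ) as aF: #| context
             # ...do.whatever.needed...                          # aF  #|
             aF.write(...)                                       # aF  #|
             # ...do.whatever.needed...                          # aF  #|
             aF.write(...)                                       # aF  #|
             # ...do.whatever.needed...                          # aF  #|
             aF.write(...)                                       # aF  #|
             pass                                                # aF  #|________
    except Exception:     # handle whatever went wrong on the fly
        ...
    finally:              # clean-up that must run in either case
        ...
    return                # kept outside finally: so exceptions are not silently swallowed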
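If the existing print calls should stay as they are, the per-process file idea above can also be applied from the outside. The following is a minimal sketch, assuming the question's fcn is left unchanged; the wrapper name run_to_file and its out_dir parameter are hypothetical. It uses contextlib.redirect_stdout from the standard library, which only captures output written through Python's sys.stdout within the calling process.

```python
import contextlib
import os


def fcn(n):
    # The question's toy function, unchanged: it only prints.
    c = int(n * (n - 1) / 2)
    print('My output will print several things like this code')
    print(c)


def run_to_file(n, out_dir="."):
    # Hypothetical wrapper: everything fcn(n) prints goes to <out_dir>/<n>.txt.
    path = os.path.join(out_dir, "{}.txt".format(n))
    with open(path, "w") as f, contextlib.redirect_stdout(f):
        fcn(n)
    return path
```

Each worker process has its own sys.stdout, so a redirect in one worker does not affect the others. A pool can then be given run_to_file instead of fcn.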

Upvotes: 1

Grzegorz Bokota

Reputation: 1804

See my response here: Multiprocessing for Pandas Dataframe write to excel sheets

The idea there is to create a writer worker and pipe the results to it through multiprocessing.Queue. (This approach is needed when you want to put all the results into a single nontrivial file format, such as Excel.)
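The writer-worker idea might look roughly like the sketch below. It is not the code from the linked answer: the names compute, worker, writer, run and the file name combined.txt are hypothetical, and a plain tab-separated text file stands in for the Excel output. A Manager queue is used here because a plain multiprocessing.Queue cannot be passed to Pool workers as an argument.

```python
import multiprocessing as mp


def compute(n):
    # Same arithmetic as the question's fcn, returned instead of printed.
    return n, int(n * (n - 1) / 2)


def worker(n, queue):
    # Runs in a pool process: compute one result and hand it to the writer.
    queue.put(compute(n))


def writer(queue, path):
    # Sole owner of the output file; stops when it receives the None sentinel.
    with open(path, "w") as f:
        for n, c in iter(queue.get, None):
            f.write("{}\t{}\n".format(n, c))


def run(values, path="combined.txt"):
    # Hypothetical driver: one writer process, many compute workers.
    with mp.Manager() as manager:
        queue = manager.Queue()
        writer_proc = mp.Process(target=writer, args=(queue, path))
        writer_proc.start()
        with mp.Pool() as pool:
            pool.starmap(worker, [(n, queue) for n in values])
        queue.put(None)  # all workers are done: tell the writer to finish
        writer_proc.join()
```

Call run(range(6)) from under an if __name__ == "__main__": guard. The order of the lines in the combined file depends on which worker finishes first.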

Another solution is to write to separate files:

def fcn(n):
    c = int(n*(n-1)/2)
    with open("file{}.txt".format(n), 'w') as ff:
        ff.write('My output will print several things like this code\n')
        ff.write(str(c)+'\n')
    return
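To run this in parallel, the function can be handed to a pool. Note that pool.apply, as used in the question, blocks until each call has finished, so a list comprehension over it runs the calls one after another; pool.map distributes them across the worker processes. Below is a minimal sketch, assuming the file should be named after the input as the question asks; the out_dir parameter and the run_all driver are hypothetical additions.

```python
import multiprocessing as mp
import os


def fcn(n, out_dir="."):
    # The function above, with the file named after the input (0.txt, 1.txt, ...).
    c = int(n * (n - 1) / 2)
    path = os.path.join(out_dir, "{}.txt".format(n))
    with open(path, "w") as ff:
        ff.write('My output will print several things like this code\n')
        ff.write(str(c) + '\n')
    return path


def run_all(values, processes=None):
    # Hypothetical driver: the pool hands each value to a worker process.
    # Only the short path strings are returned, so the outputs themselves
    # do not accumulate in the parent's memory.
    with mp.Pool(processes) as pool:
        return pool.map(fcn, values)
```

Call run_all(range(6)) from under an if __name__ == "__main__": guard; this should leave 0.txt through 5.txt in the current directory.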

Upvotes: 0
