Reputation: 8788
So I am trying to use logging
in a multiprocessing
context. Say I want to have each process write its own log file. Here is my setup:
# worker.py
import logging
import os
def worker( x ) :
"""
Write the value of x in the log file
"""
logger = logging.getLogger( __name__ )
pid = os.getpid() # get the process id
handler = logging.FileHandler( str(pid) + ".log" )
logger.addHandler( handler )
logger.info( "pid={pid}, x={x}".format( **locals() ) )
This is my main
:
import logging
import multiprocessing as mp
import worker
# logger
logger = logging.getLogger()
logger.setLevel( logging.INFO )
# handler
handler = logging.FileHandler( 'main.log' )
logger.addHandler( handler )
#
if __name__ == '__main__' :
pool = mp.Pool( processes=2 )
pool.map( worker.worker, range(5) )
pool.close()
pool.join()
And now the output log files. This is main.log
:
pid=1985, x=0
pid=1985, x=2
pid=1986, x=1
pid=1985, x=3
pid=1986, x=4
I think this output is normal: the two child processes propagate events to the root parent. So it does not correspond to the Logging to a single file from multiple processes scenario (does it?). But now here is the first child process log file:
pid=1985, x=0
pid=1985, x=2
pid=1985, x=2
pid=1985, x=3
pid=1985, x=3
pid=1985, x=3
And this is the second child log file:
pid=1986, x=1
pid=1986, x=4
pid=1986, x=4
It looks like each child process writes the log once for the first input, twice for the second, three times for the third, etc. What's going on?
Upvotes: 2
Views: 963
Reputation: 94941
This is happening because every time you enter worker
, you're getting a reference to the same logger
object, and then add a new FileHandler
to it. So if worker
is called three times, its logger
ends up having three FileHandlers
, which all happen to write to the same file. You should use the initializer
keyword argument to set up the logger in each Pool
process once at startup:
def init( ) :
logger = logging.getLogger( __name__ )
pid = os.getpid() # get the process id
handler = logging.FileHandler( str(pid) + ".log" )
logger.addHandler( handler )
Then start the pool like this
p = multiprocessing.Pool(processes=2, initializer=init)
Then worker can just grab the logger
object and use it without adding a FileHandler
to it.
Upvotes: 4