SilverBullet

Reputation: 21

Python logging in child processes

I would like to create one log file per child process when using joblib's Parallel in Python. However, when I try the minimal example below, only a few log files are created, and the logging messages end up in the files in seemingly random order. Am I doing anything wrong? This is Python 3.7.1 with joblib 0.13.2.

from joblib import Parallel, delayed
import os

def func(i):
    import logging
    logging.basicConfig(filename='logs/%d.txt' % i)
    logger = logging.getLogger()
    logger.warning('This is task %d' % i)

Parallel(n_jobs=4)(delayed(func)(i) for i in range(16))

for f in os.listdir('logs'):
    with open('logs/' + f) as r:
        print(f, '\n', r.read())

The output is:

1.txt 
 WARNING:root:This is task 1
WARNING:root:This is task 13
WARNING:root:This is task 15

0.txt 
 WARNING:root:This is task 0
WARNING:root:This is task 2
WARNING:root:This is task 3
WARNING:root:This is task 4
WARNING:root:This is task 5
WARNING:root:This is task 6
WARNING:root:This is task 7
WARNING:root:This is task 8
WARNING:root:This is task 10
WARNING:root:This is task 11
WARNING:root:This is task 12
WARNING:root:This is task 14

9.txt 
 WARNING:root:This is task 9

Upvotes: 2

Views: 2062

Answers (2)

Magnus Persson

Reputation: 811

If I understand the logging docs correctly, logging.basicConfig still applies to the root logger, which is the problem. Which file a message lands in (0.txt or 1.txt in your code) then depends on where and when each worker process happens to be spawned (as answered by @IronMan). You therefore need to add a FileHandler to each new logger that you create. Something like this:

import logging
from joblib import Parallel, delayed
import os

def func(i):
    logger = logging.getLogger(f"logger_{i}")
    # change 'mode' if appropriate for your situation ('a' is for append)
    logger.addHandler(logging.FileHandler(f"logs/{i}.txt", mode='a'))
    logger.warning(f'This is task {i}')

Parallel(n_jobs=4)(delayed(func)(i) for i in range(16))

for f in os.listdir('logs'):
    with open('logs/' + f) as r:
        print(f, '\n', r.read())

For more info check the howto in the official docs: https://docs.python.org/3/howto/logging.html
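
One caveat worth adding (this is an assumption about how joblib reuses workers, not something shown in the question's output): with the default backend, worker processes can be reused across tasks, so if func ever ran twice with the same i in the same process, getLogger would return the same logger object and a second FileHandler would be attached to it, duplicating every line. A small guard avoids that:

    def func(i):
        logger = logging.getLogger(f"logger_{i}")
        # Attach the handler only the first time this logger is seen in
        # the current process; otherwise each message would be written twice.
        if not logger.handlers:
            logger.addHandler(logging.FileHandler(f"logs/{i}.txt", mode='a'))
        logger.warning(f'This is task {i}')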

Hope it helps (better a late answer than never).

Upvotes: 3

IronMan

Reputation: 1960

basicConfig does nothing if the root logger already has handlers configured, so it takes effect at most once per process. Each process therefore creates its file based on the first task argument it happens to receive. The distribution of tasks depends on how fast the processes consume them, which is effectively random, so you're not guaranteed any particular number of processes or any particular assignment of tasks to processes.
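
A minimal single-process sketch (with hypothetical file names) shows the effect: once the root logger has a handler, later basicConfig calls are silently ignored, which is exactly what happens inside each worker process after its first task.

    import logging

    # First call attaches a FileHandler for 'first.txt' to the root logger.
    logging.basicConfig(filename='first.txt')
    # Second call is a no-op: the root logger already has a handler,
    # so 'second.txt' is never created or used.
    logging.basicConfig(filename='second.txt')

    logging.getLogger().warning('goes to first.txt, not second.txt')

(Since Python 3.8 you can pass force=True to basicConfig to replace the existing handlers, but the question uses 3.7.)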

Upvotes: 0
