Reputation: 21
I would like to create one log file per child process using joblib's
Parallel in Python. However, when I tried the following minimal example, only a few log files were created, and the logging messages were written to them in seemingly random order. Am I doing anything wrong? This is Python 3.7.1 and joblib 0.13.2.
from joblib import Parallel, delayed
import os

def func(i):
    import logging
    logging.basicConfig(filename='logs/%d.txt' % i)
    logger = logging.getLogger()
    logger.warning('This is task %d' % i)

Parallel(n_jobs=4)(delayed(func)(i) for i in range(16))

for f in os.listdir('logs'):
    r = open('logs/' + f, 'r')
    print(f, '\n', r.read())
    r.close()
The output is:
1.txt
WARNING:root:This is task 1
WARNING:root:This is task 13
WARNING:root:This is task 15
0.txt
WARNING:root:This is task 0
WARNING:root:This is task 2
WARNING:root:This is task 3
WARNING:root:This is task 4
WARNING:root:This is task 5
WARNING:root:This is task 6
WARNING:root:This is task 7
WARNING:root:This is task 8
WARNING:root:This is task 10
WARNING:root:This is task 11
WARNING:root:This is task 12
WARNING:root:This is task 14
9.txt
WARNING:root:This is task 9
Upvotes: 2
Views: 2062
Reputation: 811
If I understand the logging docs correctly, logging.basicConfig
still applies to the "root" logger, which is the problem. Which file a message then ends up in (0.txt or 1.txt in your code) depends on where and when each new process is spawned, as @IronMan answered. Thus you need to add a FileHandler to each new logger that you create. Something like this:
import logging
from joblib import Parallel, delayed
import os

def func(i):
    logger = logging.getLogger(f"logger_{i}")
    # change 'mode' if appropriate for your situation ('a' is for append)
    logger.addHandler(logging.FileHandler(f"logs/{i}.txt", mode='a'))
    logger.warning(f'This is task {i}')

Parallel(n_jobs=4)(delayed(func)(i) for i in range(16))

for f in os.listdir('logs'):
    r = open('logs/' + f, 'r')
    print(f, '\n', r.read())
    r.close()
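One caveat, in case func ever runs more than once with the same i in the same worker process: getLogger returns the same logger object for the same name, so an unconditional addHandler would attach a second FileHandler and duplicate every message. A guard like this (my own sketch, not part of the original answer; get_task_logger is a hypothetical helper name) avoids that:

```python
import logging
import os

def get_task_logger(i):
    os.makedirs('logs', exist_ok=True)  # ensure the target directory exists
    logger = logging.getLogger(f"logger_{i}")
    if not logger.handlers:  # attach a FileHandler only on first use in this process
        logger.addHandler(logging.FileHandler(f"logs/{i}.txt", mode='a'))
    return logger

# Calling twice yields the same logger with a single handler, not two:
a = get_task_logger(0)
b = get_task_logger(0)
assert a is b and len(a.handlers) == 1
```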
For more info check the howto in the official docs: https://docs.python.org/3/howto/logging.html
Hope it works (better late answer than never).
Upvotes: 3
Reputation: 1960
basicConfig
does nothing if the root logger already has handlers configured, so it only takes effect once per process. Each process's file is therefore named after the first argument that process happens to receive. How tasks are distributed depends on how fast the processes consume them, which is effectively random, so you're not guaranteed any particular number of processes or any particular set of tasks per process.
Upvotes: 0