Kyle Sargent

Reputation: 161

tf.test.is_gpu_available() is False in subprocess but True in main process

I'm currently running a PyTorch model which periodically calls out to a TensorFlow model for benchmarking purposes. I'd like both of these models to be GPU-enabled and to run in the same script. Since the TensorFlow benchmarking code claims GPU memory until the end of the process, I've elected to run the benchmarking code in a multiprocessing.Process so that my PyTorch model can use the GPU's full memory after the benchmarking code has run.

During this, I've stumbled across an unusual bug (?) in TensorFlow's GPU utilization. It seems that TensorFlow run in a subprocess refuses to use a GPU which has been touched at all by the parent process. I can have TensorFlow models and PyTorch models on the same GPU in the same process with no problems, but once I introduce subprocesses, TensorFlow is ill-behaved.

I'm running tensorflow-gpu==1.14.0, torch==1.1.0, and cudatoolkit=10.0 on an NVIDIA RTX 2080 Ti.

Below is a minimal code snippet to reproduce:

import torch
import tensorflow as tf
from multiprocessing import Process

def f():
    print(tf.test.is_gpu_available())

pa = Process(target=f, args=())
pa.start()
pa.join()

torch.ones(1).cuda()

pb = Process(target=f, args=())
pb.start()
pb.join()
Output:

True
False

Upvotes: 0

Views: 703

Answers (1)

Kyle Sargent

Reputation: 161

To anyone running into this problem: you need to call multiprocessing.set_start_method('spawn'). TensorFlow is not fork-safe, and forked children inherit the parent's global state (loaded modules, CUDA context) in ways that are very hard to reason about; spawned children start from a fresh interpreter instead. Remember to call it only once, inside an if __name__ == '__main__': guard.
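Here's a minimal sketch of the fix using only the standard library; the worker stands in for the tf.test.is_gpu_available() call in the original snippet, which I've replaced with a plain string so the sketch runs without TensorFlow installed:

```python
from multiprocessing import Process, Queue, set_start_method

def check_gpu(q):
    # In the real script this would call tf.test.is_gpu_available();
    # a stand-in value keeps the sketch runnable without TensorFlow.
    q.put("fresh interpreter, no inherited CUDA state")

if __name__ == '__main__':
    # 'spawn' starts children as fresh interpreters instead of fork()ing,
    # so the child does not inherit the parent's CUDA context.
    # Call it exactly once, before creating any Process.
    set_start_method('spawn')

    q = Queue()
    p = Process(target=check_gpu, args=(q,))
    p.start()
    print(q.get())
    p.join()
```

With 'spawn', the child re-imports the main module rather than copying the parent's memory, which is why the set_start_method call (and any Process creation) must live under the __main__ guard.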

Upvotes: 2
