dragster

Reputation: 448

TensorFlow Serving: Using a fraction of GPU memory for each model

I have one GPU at my disposal for deployment, but multiple models need to be deployed. I don't want to allocate the full GPU memory to the first deployed model, because then I can't deploy my subsequent models. During training, this could be controlled with the gpu_memory_fraction parameter. I am using the following command to deploy my model:

tensorflow_model_server --port=9000 --model_name=<name of model> --model_base_path=<path where exported models are stored> &> <log file path>
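
For reference, during training I cap the memory with something like this (the 0.3 fraction is just an example value):

import tensorflow as tf

# Limit this process to roughly 30% of the GPU's memory (illustrative value)
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.3)
sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))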

Is there a flag that I can set to control the GPU memory allocation?

Thanks

Upvotes: 3

Views: 2288

Answers (2)

Dat

Reputation: 5853

Newer versions of TF Serving allow setting the per_process_gpu_memory_fraction flag; it was added in this pull request.
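
With that flag, the deploy command would look something like this (the 0.25 value is only an illustrative fraction, not a recommendation):

tensorflow_model_server --port=9000 --model_name=<name of model> --model_base_path=<path where exported models are stored> --per_process_gpu_memory_fraction=0.25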

Upvotes: 3

John Zhou

Reputation: 11

I have just added a flag to configure the GPU memory fraction: https://github.com/zhouyoulie/serving

Upvotes: 1
