Reputation: 160
I would like to serve ~600 models with TensorFlow Serving.
I am looking for a way to eventually reduce the number of models:
My models have the same architecture, only the weights change. Is it possible to load only one model and just swap the weights?
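Here is a rough sketch of what I have in mind for this first idea; `build_model()`, the layer sizes, and the weight paths are just placeholders for my real setup:

```python
import tensorflow as tf

def build_model():
    # Same architecture for every one of the ~600 models (placeholder layers).
    return tf.keras.Sequential([
        tf.keras.Input(shape=(32,)),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(1),
    ])

# One shared in-memory model; only the weights differ between "models".
shared_model = build_model()

def predict(model_id: str, features):
    # Load this model's weights from disk (path/format are placeholders),
    # push them into the shared architecture, then run inference.
    shared_model.load_weights(f"/weights/{model_id}.h5")
    return shared_model.predict(features)
```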
Would it be possible to aggregate all those models into one, so that the input is effectively a model ID plus the input features for that model?
Has anyone tried running a couple of hundred models on one machine? I have found this Cortex solution, but I wanted to avoid adding another tech: https://towardsdatascience.com/how-to-deploy-1-000-models-on-one-cpu-with-tensorflow-serving-ec4297bff54b
Upvotes: 0
Views: 179
Reputation: 1
If the models have the same architecture but different weights, you can try merging all of them into a "super model". However, I would need to know more about the task to see whether that's possible.
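A rough sketch of what such a super model could look like, assuming a small two-dense-layer architecture (your real layers would differ): the weights of all 600 models are stacked into single tensors, and a model-ID input selects the right slice per example.

```python
import tensorflow as tf

NUM_MODELS, IN_DIM, HIDDEN = 600, 32, 64  # placeholder sizes

class MultiModelDense(tf.keras.layers.Layer):
    """One dense layer that holds a separate kernel/bias per model ID."""

    def __init__(self, units, activation=None):
        super().__init__()
        self.units = units
        self.activation = tf.keras.activations.get(activation)

    def build(self, input_shape):
        feat_shape, _ = input_shape  # shapes of (features, model_id)
        self.kernel = self.add_weight(
            shape=(NUM_MODELS, feat_shape[-1], self.units), name="kernel")
        self.bias = self.add_weight(
            shape=(NUM_MODELS, self.units), initializer="zeros", name="bias")

    def call(self, inputs):
        features, model_id = inputs
        k = tf.gather(self.kernel, model_id)   # [batch, in_dim, units]
        b = tf.gather(self.bias, model_id)     # [batch, units]
        out = tf.einsum("bi,biu->bu", features, k) + b
        return self.activation(out)

features = tf.keras.Input(shape=(IN_DIM,))
model_id = tf.keras.Input(shape=(), dtype=tf.int32)  # which of the 600 models
h = MultiModelDense(HIDDEN, activation="relu")([features, model_id])
y = MultiModelDense(1)([h, model_id])
super_model = tf.keras.Model([features, model_id], y)
```

You would populate the stacked kernel and bias tensors by copying each original model's weights into its slice, and clients would send the model ID together with the features.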
To serve 600 models, you would need a very powerful machine and a lot of memory (depending on how big your models are and how many you query in parallel).
You can either run TensorFlow Serving yourself, or use a managed provider such as Inferrd.com, Google, or AWS.
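If you run it yourself, a single TensorFlow Serving process can serve many models from one model config file (`--model_config_file`). A minimal sketch that generates such a config, assuming each model lives in its own SavedModel directory under `/models` (paths and names are just an example):

```python
from pathlib import Path

MODEL_ROOT = Path("/models")  # assumed layout: /models/<model_name>/<version>/

# Build one config {} entry per model directory.
entries = []
for model_dir in sorted(MODEL_ROOT.iterdir()):
    if model_dir.is_dir():
        entries.append(
            "  config {\n"
            f'    name: "{model_dir.name}"\n'
            f'    base_path: "{model_dir}"\n'
            '    model_platform: "tensorflow"\n'
            "  }"
        )

config_text = "model_config_list {\n" + "\n".join(entries) + "\n}\n"
(MODEL_ROOT / "models.config").write_text(config_text)

# Then start a single server for all models, e.g.:
#   tensorflow_model_server --rest_api_port=8501 \
#       --model_config_file=/models/models.config
```

Each model is then addressed by its name in the request URL, so one server handles all of them (memory permitting).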
Upvotes: 0