Reputation: 160
I would like to serve ~600 models with TensorFlow Serving.
I am looking for a way to eventually reduce the number of models:
My models have the same architecture, only the weights change. Is it possible to load only one model and just swap the weights?
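Here is a rough sketch of what I have in mind for this first idea; `build_model()`, the layer sizes, and the weight paths are just placeholders for my real setup:

```python
import tensorflow as tf

def build_model():
    # Same architecture for every one of the ~600 models (placeholder layers).
    return tf.keras.Sequential([
        tf.keras.Input(shape=(32,)),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(1),
    ])

# One shared in-memory model; only the weights differ between "models".
shared_model = build_model()

def predict(model_id: str, features):
    # Load this model's weights from disk (path/format are placeholders),
    # push them into the shared architecture, then run inference.
    shared_model.load_weights(f"/weights/{model_id}.h5")
    return shared_model.predict(features)
```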
Would it be possible to aggregate all those models into one, so that the input is effectively a model ID plus the input features for that model?
Has anyone tried running a couple of hundred models on one machine? I have found this Cortex solution, but I wanted to avoid adding another tech: https://towardsdatascience.com/how-to-deploy-1-000-models-on-one-cpu-with-tensorflow-serving-ec4297bff54b
Upvotes: 0
Views: 179
Reputation: 1
If the models have the same architecture but different weights, you can try merging all of them into a "super model". However, I would need to know more about the task to see whether that's possible.
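A rough sketch of what such a super model could look like, assuming a small two-dense-layer architecture (your real layers would differ): the weights of all 600 models are stacked into single tensors, and a model-ID input selects the right slice per example.

```python
import tensorflow as tf

NUM_MODELS, IN_DIM, HIDDEN = 600, 32, 64  # placeholder sizes

class MultiModelDense(tf.keras.layers.Layer):
    """One dense layer that holds a separate kernel/bias per model ID."""

    def __init__(self, units, activation=None):
        super().__init__()
        self.units = units
        self.activation = tf.keras.activations.get(activation)

    def build(self, input_shape):
        feat_shape, _ = input_shape  # shapes of (features, model_id)
        self.kernel = self.add_weight(
            shape=(NUM_MODELS, feat_shape[-1], self.units), name="kernel")
        self.bias = self.add_weight(
            shape=(NUM_MODELS, self.units), initializer="zeros", name="bias")

    def call(self, inputs):
        features, model_id = inputs
        k = tf.gather(self.kernel, model_id)   # [batch, in_dim, units]
        b = tf.gather(self.bias, model_id)     # [batch, units]
        out = tf.einsum("bi,biu->bu", features, k) + b
        return self.activation(out)

features = tf.keras.Input(shape=(IN_DIM,))
model_id = tf.keras.Input(shape=(), dtype=tf.int32)  # which of the 600 models
h = MultiModelDense(HIDDEN, activation="relu")([features, model_id])
y = MultiModelDense(1)([h, model_id])
super_model = tf.keras.Model([features, model_id], y)
```

You would populate the stacked kernel and bias tensors by copying each original model's weights into its slice, and clients would send the model ID together with the features.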
To serve 600 models, you would need a very powerful machine and a lot of memory (depending on how big your models are and how many you query in parallel).
You can either run TensorFlow Serving yourself, or use a managed provider such as Inferrd.com, Google, or AWS.
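If you run it yourself, a single TensorFlow Serving process can serve many models from one model config file (`--model_config_file`). A minimal sketch that generates such a config, assuming each model lives in its own SavedModel directory under `/models` (paths and names are just an example):

```python
from pathlib import Path

MODEL_ROOT = Path("/models")  # assumed layout: /models/<model_name>/<version>/

# Build one config {} entry per model directory.
entries = []
for model_dir in sorted(MODEL_ROOT.iterdir()):
    if model_dir.is_dir():
        entries.append(
            "  config {\n"
            f'    name: "{model_dir.name}"\n'
            f'    base_path: "{model_dir}"\n'
            '    model_platform: "tensorflow"\n'
            "  }"
        )

config_text = "model_config_list {\n" + "\n".join(entries) + "\n}\n"
(MODEL_ROOT / "models.config").write_text(config_text)

# Then start a single server for all models, e.g.:
#   tensorflow_model_server --rest_api_port=8501 \
#       --model_config_file=/models/models.config
```

Each model is then addressed by its name in the request URL, so one server handles all of them (memory permitting).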
Upvotes: 0