user8574993

Reputation: 81

TensorFlow - Step-by-step: getting a model into a production environment

At the moment I don't have a clue how to get my trained model into a production environment.

I did read this article on medium: https://medium.com/zendesk-engineering/how-zendesk-serves-tensorflow-models-in-production-751ee22f0f4b

They say that they use the TensorFlow Serving API (C++).

If I understand correctly, it's essentially a local server that loads a model on my system and is accessible from a client (in any programming language that supports gRPC), and voilà, I have my predictions. Is this right?

Let's say I have a C#/.NET environment and I want predictions on movies. Do I "just" have to use the gRPC protocol and the TensorFlow Serving API?

Are there other ways to get my trained model into a production environment? What are your steps? Any help and advice is appreciated! Thanks!

Upvotes: 1

Views: 3195

Answers (2)

mäggy

Reputation: 96

Besides using TensorFlow Serving, there are APIs to integrate your model directly into your program. An API for C# can be found here. You might also find some useful pointers to other C# examples in this thread.

The basic steps for using the model in an application other than the one it was trained in are the same for any API. First, you export your trained model into a .pb file, e.g. by using the freeze_graph function. In your application you define a new graph, read its definition from the file, start a session with it, and then feed and run it. A C# code example of that can be found in the README of the API linked above. The difficult part is usually figuring out how to convert your inputs into a form TensorFlow can handle.
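To illustrate that last step, here is a minimal sketch of the input-conversion part in NumPy. The function name, image shape, and scaling are assumptions for the example, not from the linked C# API; the point is just producing a batched float32 array of the shape and dtype a typical image model expects.

```python
import numpy as np

# Hypothetical example: turn raw 8-bit grayscale image bytes into the
# float32 batch tensor a typical image model expects: shape (1, H, W, 1),
# values scaled to [0, 1].
def bytes_to_input_tensor(raw: bytes, height: int, width: int) -> np.ndarray:
    pixels = np.frombuffer(raw, dtype=np.uint8)
    assert pixels.size == height * width, "unexpected image size"
    tensor = pixels.astype(np.float32) / 255.0
    # Add batch and channel dimensions.
    return tensor.reshape(1, height, width, 1)

raw = bytes(range(4))            # a fake 2x2 "image"
x = bytes_to_input_tensor(raw, 2, 2)
print(x.shape, x.dtype)          # (1, 2, 2, 1) float32
```

The same conversion code (shapes, dtypes, scaling) must match exactly what the model saw during training.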

Upvotes: 2

Sorin

Reputation: 11968

There are many ways to use a trained model.

gRPC to a TensorFlow Serving instance is one way. It's nice that you can use it from almost any language/platform, but the setup is a bit tedious since you need to run this separate process. It can be advantageous if you are running on many machines and don't want to load the model on each of them, or if you already have some distributed system in place.
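As a sketch of the client side: TensorFlow Serving also exposes a REST endpoint alongside gRPC (POST to `http://host:8501/v1/models/<name>:predict`), which is shown here to keep the example dependency-free. The host, port, model name, and input values are made-up placeholders.

```python
import json

# Build the JSON body TensorFlow Serving's REST predict endpoint expects.
# "my_model" and the input row are hypothetical.
def build_predict_request(instances):
    return json.dumps({"instances": instances})

url = "http://localhost:8501/v1/models/my_model:predict"
body = build_predict_request([[1.0, 2.0, 3.0]])
print(url)
print(body)  # {"instances": [[1.0, 2.0, 3.0]]}
```

A gRPC client does the same thing with a PredictRequest protobuf instead of JSON; either way, any language that can speak the protocol can get predictions from the same running server.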

Another way to do it would be to write a small C++ wrapper library that you link into your C# application. The advantage is that the model now runs inside your process. You can also pass your custom classes along and not bother with gRPC. I don't think there's a C# API for TensorFlow, so unfortunately you have to write some C++.

If you want to go all out, you could dump the weights of the trained model and apply them yourself in your application. For small models this might give you the best performance, since you bypass TensorFlow's checks/threads/etc., though it's fairly brittle (any change in the model structure must be reflected in your code or nothing works). For larger models the computation speed will be the dominant factor, so this approach is not worth it.
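A minimal sketch of that "apply the weights yourself" idea: a single dense layer with ReLU computed directly in NumPy. The weight values here are made up and stand in for arrays exported from a trained model; a real model would chain several such steps, and every one must mirror the original graph exactly.

```python
import numpy as np

# W and b stand in for weights dumped from a trained model (values invented).
W = np.array([[1.0, -1.0],
              [0.5,  2.0]], dtype=np.float32)   # shape (in=2, out=2)
b = np.array([0.0, 1.0], dtype=np.float32)

def dense_relu(x):
    # y = relu(x @ W + b), exactly what the corresponding graph node computes.
    return np.maximum(x @ W + b, 0.0)

y = dense_relu(np.array([[2.0, 1.0]], dtype=np.float32))
# y == [[2.5, 1.0]]
```

This is where the brittleness shows: add a layer or change a shape in training, and this hand-written copy silently diverges unless you update it in lockstep.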

In any case, the most important part is to make sure that the inputs you give are consistent with what you trained the model on. Any preprocessing steps must be identical for both training and inference.
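One simple way to enforce that consistency is to keep a single preprocessing function and call it from both the training pipeline and the serving path. The mean/std constants below are hypothetical stand-ins for statistics computed on the training set.

```python
import numpy as np

# Statistics "saved" from the training set (invented values for the sketch).
TRAIN_MEAN = 4.0
TRAIN_STD = 2.0

def preprocess(x):
    # Identical normalization for training batches and live requests.
    return (np.asarray(x, dtype=np.float32) - TRAIN_MEAN) / TRAIN_STD

train_batch = preprocess([2.0, 4.0, 6.0])   # used when training
serving_input = preprocess([6.0])           # used at inference
# train_batch == [-1.0, 0.0, 1.0], serving_input == [1.0]
```

If the serving side re-derives these statistics, or normalizes differently, the model receives inputs from a different distribution than it was trained on and its predictions quietly degrade.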

Upvotes: 0
