Reputation: 21
I want to run distributed TensorFlow on GKE. I am looking for a sample that covers everything from setting up GKE through to running distributed TensorFlow. Does anyone know a good one?
Upvotes: 2
Views: 587
Reputation: 6776
If you want to run TensorFlow on Google Cloud Platform, one option is Google Cloud Machine Learning.
Upvotes: 0
Reputation: 126154
A recent workshop (slides) at OSCON and PyCon covered (among other things) running distributed TensorFlow on Kubernetes. There is a GitHub repository including the necessary configuration scripts and a Jupyter notebook that can be used to interact with the cluster.
See the workshop for more details, but the basic idea is that the master, each worker, and each parameter server run in separate Kubernetes replication controllers of size 1. Kubernetes gives stable names to each of these processes, which you can use to build a tf.train.ClusterSpec and interconnect the different processes.
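As a rough illustration of that idea, here is a minimal sketch of turning stable Kubernetes names into the dictionary that tf.train.ClusterSpec accepts. The host names (tf-master-0, tf-worker-N, tf-ps-N), the port 2222, and both helper functions are assumptions made up for this example. They are not taken from the workshop repository, so substitute whatever names your own Kubernetes services expose.

```python
def build_cluster_dict(num_workers, num_ps, port=2222):
    """Map each job name to the "host:port" addresses of its tasks.

    The host names are hypothetical: they assume one Kubernetes service
    per process, named tf-master-0, tf-worker-<i>, and tf-ps-<i>.
    """
    return {
        "master": ["tf-master-0:%d" % port],
        "worker": ["tf-worker-%d:%d" % (i, port) for i in range(num_workers)],
        "ps": ["tf-ps-%d:%d" % (i, port) for i in range(num_ps)],
    }


def make_cluster_spec(cluster_dict):
    """Wrap the dictionary in a tf.train.ClusterSpec.

    TensorFlow is imported here so the dictionary can be built and
    inspected on a machine that does not have TensorFlow installed.
    """
    import tensorflow as tf
    return tf.train.ClusterSpec(cluster_dict)


# Two workers and one parameter server, plus the single master.
cluster = build_cluster_dict(num_workers=2, num_ps=1)
for job, addresses in sorted(cluster.items()):
    print(job, addresses)

# Inside each container (TensorFlow 1.x style), a process would then start
# its own server using the job name and task index it was launched with:
#   spec = make_cluster_spec(cluster)
#   server = tf.train.Server(spec, job_name="worker", task_index=0)
```

Every process builds the same dictionary and differs only in the job name and task index it passes when starting its server, which is why stable names from Kubernetes matter here.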
There are other ways to set up a cluster, which require more configuration, but the tutorial gives a nice introduction to setting up synchronous training on a word2vec model.
Upvotes: 2