tanlccc

Reputation: 363

TensorFlow and Hadoop Deployment

A Cloudera Hadoop deployment splits the cluster into infra nodes and data nodes. Using the same hardware layout, where should TensorFlow be deployed: on the infra nodes or on the data nodes?

Since TensorFlow benefits from a GPU, I need to know where it will run so I can decide which nodes to add the GPUs to.

Upvotes: 0

Views: 621

Answers (2)

zhz

Reputation: 171

https://github.com/linkedin/TonY

With TonY, you can submit a TensorFlow job and specify the number of workers and whether they require CPUs or GPUs.

Below is an example of how to use it from the README:

In the tony directory there’s also a tony.xml which contains all of your TonY job configurations. For example:

$ cat tony/tony.xml
<configuration>
  <property>
    <name>tony.worker.instances</name>
    <value>4</value>
  </property>
  <property>
    <name>tony.worker.memory</name>
    <value>4g</value>
  </property>
  <property>
    <name>tony.worker.gpus</name>
    <value>1</value>
  </property>
  <property>
    <name>tony.ps.memory</name>
    <value>3g</value>
  </property>
</configuration>

For a full list of configurations, please see the wiki.
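One point relevant to the original question: GPUs are requested per worker container, so it is the nodes running the workers (typically the data nodes, where the YARN NodeManagers live) that need the GPU hardware. As a sketch, assuming the same tony.xml conventions as above, a CPU-only job would simply drop the GPU request:

```xml
<configuration>
  <property>
    <name>tony.worker.instances</name>
    <value>4</value>
  </property>
  <property>
    <name>tony.worker.memory</name>
    <value>4g</value>
  </property>
  <!-- No tony.worker.gpus property: workers are scheduled without GPUs. -->
  <property>
    <name>tony.ps.memory</name>
    <value>3g</value>
  </property>
</configuration>
```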

Model code
$ ls src/models/ | grep mnist_distributed
  mnist_distributed.py

Then you can launch your job:

$ java -cp "`hadoop classpath --glob`:tony/*:tony" \
            com.linkedin.tony.cli.ClusterSubmitter \
            -executes src/models/mnist_distributed.py \
            -task_params '--input_dir /path/to/hdfs/input --output_dir /path/to/hdfs/output --steps 2500 --batch_size 64' \
            -python_venv my-venv.zip \
            -python_binary_path Python/bin/python \
            -src_dir src \
            -shell_env LD_LIBRARY_PATH=/usr/java/latest/jre/lib/amd64/server

The command line arguments are as follows:

* `executes` describes the location of the entry point of your training code.
* `task_params` describes the command line arguments which will be passed to your entry point.
* `python_venv` describes the name of the local zip which will be used to invoke your python script.
* `python_binary_path` describes the relative path in your python virtual environment which contains the python binary, or an absolute path to use a python binary already installed on all worker nodes.
* `src_dir` specifies the name of the local root directory which contains all of your python model source code. This directory will be copied to all worker nodes.
* `shell_env` specifies key-value pairs for environment variables which will be set in your python worker/ps processes.
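To connect these arguments back to the entry point: below is a hypothetical sketch (not the actual mnist_distributed.py from the TonY repo) of how a training script consumes what the submit command provides. The task_params arrive as ordinary command-line flags, and TonY, like other YARN TensorFlow launchers, exposes the cluster layout to each worker/ps process via the TF_CONFIG environment variable. The flag names below mirror the -task_params string in the launch command above.

```python
# Hypothetical entry-point sketch: parse task_params flags and the TF_CONFIG
# environment variable that TonY sets for each worker/ps process.
import argparse
import json
import os

def parse_role(tf_config_json):
    """Return (job_name, task_index, cluster_spec) from a TF_CONFIG string."""
    tf_config = json.loads(tf_config_json)
    task = tf_config["task"]
    return task["type"], task["index"], tf_config["cluster"]

def parse_task_params(argv):
    # Flag names mirror the -task_params string in the launch command above.
    parser = argparse.ArgumentParser()
    parser.add_argument("--input_dir")
    parser.add_argument("--output_dir")
    parser.add_argument("--steps", type=int, default=1000)
    parser.add_argument("--batch_size", type=int, default=32)
    return parser.parse_args(argv)

if __name__ == "__main__" and "TF_CONFIG" in os.environ:
    args = parse_task_params(None)  # None -> read from sys.argv
    job, idx, cluster = parse_role(os.environ["TF_CONFIG"])
    print("task %s:%d of cluster %s, training for %d steps" %
          (job, idx, cluster, args.steps))
```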

Upvotes: 2

fjxx

Reputation: 945

TensorFlow can train on either CPU or GPU, and a GPU is not required for inference (classification). Here are two good guides on running TensorFlow on Hadoop and YARN:

https://www.tensorflow.org/deploy/hadoop

https://hortonworks.com/blog/distributed-tensorflow-assembly-hadoop-yarn/

Upvotes: 0
