Markus M

Reputation: 11

Google Cloud ML: ImportError "No module named tensorflow" when using nightly TF

I want to train Google's NMT model on Google Cloud ML.

I put all the input data in a bucket and cloned the git repository. The model needs the nightly version of TensorFlow, so I declared it in setup.py. With the CPU version, tf-nightly==1.5.0-dev20171115, the following local run through gcloud works.

Train locally:

gcloud ml-engine local train --package-path nmt/ \
                             --module-name nmt.nmt \
                             -- --src=en --tgt=de \
                             --hparams_path=$HPARAMAS_PATH \
                             --out_dir=$OUTPUT_DIR \
                             --vocab_prefix=$VOCAB_PREFIX \
                             --train_prefix=$TRAIN_PREFIX \
                             --dev_prefix=$DEV_PREFIX \
                             --test_prefix=$TEST_PREFIX

When I switch to the GPU version and submit the job with the following command, I get the error message below a few minutes after submission.

Train on cloud:

gcloud ml-engine jobs submit training $JOB_NAME \
                             --runtime-version 1.2 \
                             --job-dir $JOB_DIR \
                             --package-path nmt/ \
                             --module-name nmt.nmt \
                             --scale-tier BASIC_GPU \
                             --region $REGION \
                             -- --src=en --tgt=de \
                             --hparams_path=$HPARAMAS_PATH \
                             --out_dir=$OUTPUT_DIR \
                             --vocab_prefix=$VOCAB_PREFIX \
                             --train_prefix=$TRAIN_PREFIX \
                             --dev_prefix=$DEV_PREFIX \
                             --test_prefix=$TEST_PREFIX

Error:

import tensorflow as tf
ImportError: No module named tensorflow

setup.py:

from setuptools import find_packages
from setuptools import setup
REQUIRED_PACKAGES = ['tf-nightly-gpu==1.5.0-dev20171115']
setup(
        name="nmt",
        install_requires=REQUIRED_PACKAGES,
        packages=find_packages(),
        include_package_data=True,
        version='0.1.2'
)

Thank you all in advance, Markus

Update:

I found this note in the GCP docs: "Note: Training with TensorFlow versions 1.3+ is limited to CPUs only. See the Cloud ML Engine release notes for updates."

So it seems this doesn't work currently, and I think I have to go with Compute Engine.

Or is there any hack to get it working?

Either way, thank you for your help.

Upvotes: 1

Views: 399

Answers (1)

Guoqing Xu

Reputation: 482

TensorFlow 1.5 might need a newer version of CUDA (i.e., CUDA 9), but the version installed on Cloud ML Engine is CUDA 8. Can you please try TensorFlow 1.4 instead, which works with CUDA 8? Please tell us here whether 1.4 works for you, or send us an email via [email protected]
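
While testing different versions, it may help to log what the training worker can actually import, so the job logs show whether the package from setup.py was installed at all. The following is only a minimal diagnostic sketch, not part of the NMT code: the helper names describe_module and environment_report are hypothetical, and it uses nothing beyond the Python standard library.

```python
import importlib
import sys


def describe_module(name):
    """Return a one-line report on whether `name` can be imported."""
    try:
        module = importlib.import_module(name)
    except ImportError as exc:
        # Same failure class as "No module named tensorflow".
        return "{}: NOT importable ({})".format(name, exc)
    # Not every module defines __version__, so fall back to a placeholder.
    version = getattr(module, "__version__", "unknown version")
    return "{}: importable, {}".format(name, version)


def environment_report(module_names=("tensorflow",)):
    """Build a short multi-line report for the job logs (hypothetical helper)."""
    lines = ["python: {}".format(sys.version.split()[0])]
    lines.extend(describe_module(name) for name in module_names)
    return "\n".join(lines)
```

Calling print(environment_report()) before the first "import tensorflow" in the trainer module would write the report to the job output. That is an assumption about where to place it, not something the NMT repository provides.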

Upvotes: 0
