Reputation: 1478
I am trying to read TFRecords from S3 in a SageMaker notebook instance, following the instructions here: https://www.tensorflow.org/versions/master/deploy/s3
import tensorflow as tf
import os
os.environ['AWS_ACCESS_KEY_ID'] = '<my-key>'
os.environ['AWS_SECRET_ACCESS_KEY'] = '<my-secret>'
from tensorflow.python.lib.io import file_io
print(file_io.stat('s3://<my-bucket>/data/DEMO-mnist/train.tfrecords'))
The above code fails with the error:
---------------------------------------------------------------------------
NotFoundError Traceback (most recent call last)
<ipython-input-7-770c0aef6d7b> in <module>()
1 from tensorflow.python.lib.io import file_io
----> 2 print(file_io.stat('s3://<my-bucket>/data/DEMO-mnist/train.tfrecords'))
~/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/lib/io/file_io.py in stat(filename)
551 with errors.raise_exception_on_not_ok_status() as status:
552 pywrap_tensorflow.Stat(compat.as_bytes(filename), file_statistics, status)
--> 553 return file_statistics
~/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py in __exit__(self, type_arg, value_arg, traceback_arg)
517 None, None,
518 compat.as_text(c_api.TF_Message(self.status.status)),
--> 519 c_api.TF_GetCode(self.status.status))
520 # Delete the underlying status object from memory otherwise it stays alive
521 # as there is a reference to status from this from the traceback due to
NotFoundError: Object s3://<my-bucket>/data/DEMO-mnist/train.tfrecords does not exist
However, the same code works fine when I run it from a regular EC2 instance without SageMaker.
The IAM role used for the notebook instance has full S3 access.
Upvotes: 0
Views: 980
Reputation: 176
I reproduced the problem in us-west-2.
However, after I manually exported the environment variable AWS_REGION='us-west-2', it worked.
I also tried a bucket in us-east-1 without exporting AWS_REGION, and that worked too.
So for some reason the region configured in the AWS profile is not picked up. If the AWS_REGION environment variable is not set, the region always falls back to the default, us-east-1.
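For reference, here is a minimal sketch of that workaround applied from inside the notebook, reusing the file_io.stat call from the question. The helper names set_s3_region and stat_s3_object are hypothetical, and 'us-west-2' is an assumption that should be replaced with the bucket's actual region. I have not confirmed exactly when the S3 filesystem reads the variable, so the sketch sets it before any S3 access to be safe.

```python
import os


def set_s3_region(region):
    """Export AWS_REGION so the S3 client does not fall back to us-east-1."""
    os.environ["AWS_REGION"] = region


def stat_s3_object(path):
    """Stat an S3 object using the same call as in the question."""
    # Imported inside the function so the region is already set
    # before TensorFlow touches S3.
    from tensorflow.python.lib.io import file_io

    return file_io.stat(path)


# Assumption: the bucket lives in us-west-2; change this to match your bucket.
set_s3_region("us-west-2")

# Then retry the original call, for example:
# print(stat_s3_object("s3://<my-bucket>/data/DEMO-mnist/train.tfrecords"))
```

The same effect should be achievable by exporting AWS_REGION in the shell before the notebook kernel starts.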
Upvotes: 2