Reputation: 87
I know that loading a .csv file from an S3 bucket into a SageMaker notebook is pretty straightforward, but I want to load a model.tar.gz file stored in an S3 bucket. I tried the following:
import tarfile
import botocore
import sagemaker
from sagemaker import get_execution_role
from sagemaker.predictor import csv_serializer
import boto3
sm_client = boto3.client(service_name='sagemaker')
runtime_sm_client = boto3.client(service_name='sagemaker-runtime')
s3 = boto3.resource('s3')
s3_client = boto3.client('s3')
sagemaker_session = sagemaker.Session()
role = get_execution_role()
ACCOUNT_ID = boto3.client('sts').get_caller_identity()['Account']
REGION = boto3.Session().region_name
BUCKET = 'sagemaker.prismade.net'
data_key = 'DEMO_MME_ANN/multi_model_artifacts/axel.tar.gz'
loc = 's3://{}/{}'.format(BUCKET, data_key)
print(loc)
with tarfile.open(loc) as tar:
    tar.extractall(path='.')
I get the following error:
--------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
<ipython-input-215-bfdddac71b95> in <module>()
20 loc = 's3://{}/{}'.format(BUCKET, data_key)
21 print(loc)
---> 22 with tarfile.open(loc) as tar:
23 tar.extractall(path='.')
~/anaconda3/envs/python3/lib/python3.6/tarfile.py in open(cls, name, mode, fileobj, bufsize, **kwargs)
1567 saved_pos = fileobj.tell()
1568 try:
-> 1569 return func(name, "r", fileobj, **kwargs)
1570 except (ReadError, CompressionError):
1571 if fileobj is not None:
~/anaconda3/envs/python3/lib/python3.6/tarfile.py in gzopen(cls, name, mode, fileobj, compresslevel, **kwargs)
1632
1633 try:
-> 1634 fileobj = gzip.GzipFile(name, mode + "b", compresslevel, fileobj)
1635 except OSError:
1636 if fileobj is not None and mode == 'r':
~/anaconda3/envs/python3/lib/python3.6/gzip.py in __init__(self, filename, mode, compresslevel, fileobj, mtime)
161 mode += 'b'
162 if fileobj is None:
--> 163 fileobj = self.myfileobj = builtins.open(filename, mode or 'rb')
164 if filename is None:
165 filename = getattr(fileobj, 'name', '')
FileNotFoundError: [Errno 2] No such file or directory: 's3://sagemaker.prismade.net/DEMO_MME_ANN/multi_model_artifacts/axel.tar.gz'
What is the mistake here, and how can I load the archive from S3?
Upvotes: 6
Views: 4246
Reputation: 12891
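Not every Python library that is designed to work with a file system (tarfile.open, in this example) knows how to read an object from S3 as a file. As the traceback shows, the s3:// URI ends up being passed to the built-in open, which treats it as a local path and fails with FileNotFoundError.
The simplest fix is to first download the object to the local file system as a file: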
import boto3
s3 = boto3.client('s3')
s3.download_file('BUCKET_NAME', 'OBJECT_NAME', 'FILE_NAME')
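Once the object is on local disk, tarfile can open it like any other archive. The following is a minimal sketch of the download-then-extract flow, not a definitive implementation. The helper name extract_s3_tarball, its parameters, and the temporary file name archive.tar.gz are hypothetical and chosen for illustration; only download_file and the tarfile calls come from boto3 and the standard library.

```python
import os
import tarfile
import tempfile


def extract_s3_tarball(bucket, key, dest_dir=".", s3_client=None):
    """Download s3://bucket/key to a temporary file and extract it into dest_dir.

    Hypothetical helper for illustration. Returns the archive's member names.
    """
    if s3_client is None:
        # Imported lazily so a stub client can be passed in for testing.
        import boto3
        s3_client = boto3.client("s3")

    # tarfile.open expects a local path (or a file-like object), not an
    # S3 URI, so the object is copied to a temporary local file first.
    with tempfile.TemporaryDirectory() as tmp_dir:
        local_path = os.path.join(tmp_dir, "archive.tar.gz")
        s3_client.download_file(bucket, key, local_path)

        with tarfile.open(local_path, mode="r:gz") as tar:
            names = tar.getnames()
            # Only extract archives you trust: member paths inside a
            # tarball can point outside dest_dir.
            tar.extractall(path=dest_dir)

    return names
```

With the bucket and key from the question, the call would look like extract_s3_tarball('sagemaker.prismade.net', 'DEMO_MME_ANN/multi_model_artifacts/axel.tar.gz'), assuming the notebook's execution role is allowed to read that object.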
Upvotes: 8