Reputation: 515
Using Amazon SageMaker, I created an XGBoost model. After unpacking the resulting tar.gz file, I end up with a file "xgboost-model".
The next step is to load the model with pickle directly from my S3 bucket, without downloading it first. Here is what I tried:
obj = client.get_object(Bucket='...',Key='xgboost-model')
xgb_model = pkl.load(open((obj['Body'].read())),"rb")
But it throws me the error:
TypeError: embedded NUL character
Also tried this:
xgb_model = pkl.loads(open((obj['Body'].read())),"rb")
The outcome was the same.
Another approach:
bucket='...'
key='xgboost-model'
with s3io.open('s3://{0}/{1}'.format(bucket, key), mode='w') as s3_file:
    pkl.dump(mdl, s3_file)
This gives the error:
CertificateError: hostname bucket doesn't match either of '*.s3.amazonaws.com', 's3.amazonaws.com'
This is despite the bucket name being correct.
How can I load the model as a pickle object so I can then use it for predictions?
Upvotes: 1
Views: 3353
Reputation: 11
If you have trained the model using SageMaker's XGBoost built-in algorithm and would like to use that model later to make predictions in a SageMaker environment, you can use the estimator's 'attach' method.
Right after fitting the XGBoost estimator, you can use
model_job_name = xgb_model._current_job_name
to determine the training job's name. Alternatively, you can find the name of the job you ran in the 'Training jobs' section of the SageMaker console.
Later when you want to reuse the model you do:
import sagemaker
reloaded_xgb_model = sagemaker.estimator.Estimator.attach(model_job_name)
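Once attached, the estimator can be deployed to an endpoint to serve predictions. Below is a minimal sketch, assuming the SageMaker Python SDK v2; the instance type and the sample CSV row are placeholders:
from sagemaker.serializers import CSVSerializer

# Deploy the reattached estimator to a real-time endpoint
predictor = reloaded_xgb_model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.large",   # placeholder instance type
    serializer=CSVSerializer(),    # built-in XGBoost accepts CSV input
)

# One CSV row of feature values (placeholder)
result = predictor.predict("1.0,2.0,3.0")

# Delete the endpoint when done to stop incurring charges
predictor.delete_endpoint()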
Upvotes: 0
Reputation: 1213
My assumption is that you have trained the model using the SageMaker XGBoost built-in algorithm and would like to use that model to make predictions in your own hosting environment (not SageMaker hosting).
pickle.load(file) reads a pickled object from the open file object file, while pickle.loads(bytes_object) reads a pickled object from a bytes object and returns the deserialized object. Since you have already downloaded the S3 object into memory as bytes, you can use pickle.loads without open:
xgb_model = pkl.loads(obj['Body'].read())
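Putting it together, here is a minimal sketch; the bucket name and feature values are placeholders, and it assumes the unpacked artifact is a pickled xgboost Booster (which is what the SageMaker built-in XGBoost algorithm produces):
import pickle as pkl
import boto3
import numpy as np
import xgboost as xgb

client = boto3.client("s3")

# Read the model artifact straight into memory as bytes and unpickle it
obj = client.get_object(Bucket="my-model-bucket", Key="xgboost-model")  # placeholder bucket
xgb_model = pkl.loads(obj["Body"].read())

# The result is an xgboost Booster, so wrap the input rows in a DMatrix
data = xgb.DMatrix(np.array([[1.0, 2.0, 3.0]]))  # placeholder feature row
preds = xgb_model.predict(data)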
Upvotes: 1