Reputation: 664
I'm running a SageMaker batch transform with a local transformer. However, it seems the transform is not calling my custom code.
The following is the SKLearn estimator setup:
from sagemaker.sklearn.estimator import SKLearn
source_dir = 'train'
script_path = 'train.py'
sklearn = SKLearn(
    entry_point=script_path,
    train_instance_type="local_gpu",
    source_dir=source_dir,
    role=role,
    sagemaker_session=sagemaker_session)
sklearn.fit({'train': "file://test.csv"})
train.py is a Python script that loads the training data and saves the model to S3.
The batch transform is:
transformer = sklearn.transformer(instance_count=1,
    entry_point=source_dir+"/"+script_path,
    instance_type='local_gpu',
    strategy='MultiRecord',
    assemble_with='Line'
)
transformer.transform("file://test_messages", content_type='text/csv', split_type='Line')
print('Waiting for transform job: ' + transformer.latest_transform_job.job_name)
transformer.wait()
file://test_messages contains a CSV file that is a list of strings.
The full error is:
algo-1-6c5rl_1 | 172.18.0.1 - - [30/Jan/2020:14:14:30 +0000] "GET /ping HTTP/1.1" 200 0 "-" "-"
algo-1-6c5rl_1 | 172.18.0.1 - - [30/Jan/2020:14:14:30 +0000] "GET /execution-parameters HTTP/1.1" 404 232 "-" "-"
algo-1-6c5rl_1 | 2020-01-30 14:14:30,846 ERROR - train - Exception on /invocations [POST]
algo-1-6c5rl_1 | Traceback (most recent call last):
algo-1-6c5rl_1 | File "/miniconda3/lib/python3.7/site-packages/sagemaker_containers/_functions.py", line 93, in wrapper
algo-1-6c5rl_1 | return fn(*args, **kwargs)
algo-1-6c5rl_1 | File "/miniconda3/lib/python3.7/site-packages/sagemaker_sklearn_container/serving.py", line 56, in default_input_fn
algo-1-6c5rl_1 | return np_array.astype(np.float32) if content_type in content_types.UTF8_TYPES else np_array
algo-1-6c5rl_1 | ValueError: could not convert string to float: 'IMPORTANT - You could be entitled up to �3,160 in compensation from mis-sold PPI on a credit card or loan. Please reply PPI for info or STOP to opt out.'
algo-1-6c5rl_1 |
algo-1-6c5rl_1 | During handling of the above exception, another exception occurred:
algo-1-6c5rl_1 |
algo-1-6c5rl_1 | Traceback (most recent call last):
algo-1-6c5rl_1 | File "/miniconda3/lib/python3.7/site-packages/flask/app.py", line 2446, in wsgi_app
algo-1-6c5rl_1 | response = self.full_dispatch_request()
algo-1-6c5rl_1 | File "/miniconda3/lib/python3.7/site-packages/flask/app.py", line 1951, in full_dispatch_request
algo-1-6c5rl_1 | rv = self.handle_user_exception(e)
algo-1-6c5rl_1 | File "/miniconda3/lib/python3.7/site-packages/flask/app.py", line 1820, in handle_user_exception
algo-1-6c5rl_1 | reraise(exc_type, exc_value, tb)
algo-1-6c5rl_1 | File "/miniconda3/lib/python3.7/site-packages/flask/_compat.py", line 39, in reraise
algo-1-6c5rl_1 | raise value
algo-1-6c5rl_1 | File "/miniconda3/lib/python3.7/site-packages/flask/app.py", line 1949, in full_dispatch_request
algo-1-6c5rl_1 | rv = self.dispatch_request()
algo-1-6c5rl_1 | File "/miniconda3/lib/python3.7/site-packages/flask/app.py", line 1935, in dispatch_request
algo-1-6c5rl_1 | return self.view_functions[rule.endpoint](**req.view_args)
algo-1-6c5rl_1 | File "/miniconda3/lib/python3.7/site-packages/sagemaker_containers/_transformer.py", line 200, in transform
algo-1-6c5rl_1 | self._model, request.content, request.content_type, request.accept
algo-1-6c5rl_1 | File "/miniconda3/lib/python3.7/site-packages/sagemaker_containers/_transformer.py", line 227, in _default_transform_fn
algo-1-6c5rl_1 | data = self._input_fn(content, content_type)
algo-1-6c5rl_1 | File "/miniconda3/lib/python3.7/site-packages/sagemaker_containers/_functions.py", line 95, in wrapper
algo-1-6c5rl_1 | six.reraise(error_class, error_class(e), sys.exc_info()[2])
algo-1-6c5rl_1 | File "/miniconda3/lib/python3.7/site-packages/six.py", line 692, in reraise
algo-1-6c5rl_1 | raise value.with_traceback(tb)
algo-1-6c5rl_1 | File "/miniconda3/lib/python3.7/site-packages/sagemaker_containers/_functions.py", line 93, in wrapper
algo-1-6c5rl_1 | return fn(*args, **kwargs)
algo-1-6c5rl_1 | File "/miniconda3/lib/python3.7/site-packages/sagemaker_sklearn_container/serving.py", line 56, in default_input_fn
algo-1-6c5rl_1 | return np_array.astype(np.float32) if content_type in content_types.UTF8_TYPES else np_array
algo-1-6c5rl_1 | sagemaker_containers._errors.ClientError: could not convert string to float: 'IMPORTANT - You could be entitled up to �3,160 in compensation from mis-sold PPI on a credit card or loan. Please reply PPI for info or STOP to opt out.'
algo-1-6c5rl_1 | 172.18.0.1 - - [30/Jan/2020:14:14:30 +0000] "POST /invocations HTTP/1.1" 500 290 "-" "-"
.Waiting for transform job: sagemaker-scikit-learn-2020-01-30-14-14-30-490
It seems that it is not able to process my strings. I do have code in train.py to convert the strings using TfidfVectorizer, but that code is not getting called.
Upvotes: 2
Views: 2516
Reputation: 63
You can just use a modified version of default_input_fn as your input_fn:
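from sagemaker_containers.beta.framework import encoders  # same module the container's serving.py uses

def input_fn(input_data, content_type):
    return encoders.decode(input_data, content_type)  # decode only, without the float32 cast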
This worked out for me.
Upvotes: 0
Reputation: 51
I'm an engineer at AWS SageMaker. Thanks for providing details from your Estimator/Transformer setup, as well as the full error log.
Looking at the specific error, it looks like the Scikit-learn container failed in default_input_fn. Thankfully, SageMaker Scikit-learn is open source, so we can go straight to the source, sagemaker_sklearn_container/serving.py#L56, to understand how it works.
The container executed the "default" input function to process the input before sending it to the model. As the traceback shows, that default casts the decoded data to float32, which fails for text input such as yours.
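To illustrate the failure in isolation, here is a minimal sketch that uses only NumPy. It assumes the decoded payload is an array of strings and applies the same float32 cast that appears in the traceback. The sample message and the variable names are made up for illustration.

```python
import numpy as np

# Assumption: after decoding, the CSV payload is an array of raw strings.
decoded = np.array(["Please reply PPI for info or STOP to opt out."])

try:
    # Same kind of cast that default_input_fn applies in the traceback.
    decoded.astype(np.float32)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

The cast raises a ValueError because free text cannot be parsed as a number, which is why text input needs its own input_fn.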
Similar to training, you need to provide custom Python code to control how SageMaker Scikit-learn handles your model in serving/inference mode. If you would like to override the default, you need to implement input_fn in your custom Python code. You may choose to add this to your train.py script, or pass a different Python file in the Transformer.
This doc should be helpful in writing the input_fn: https://sagemaker.readthedocs.io/en/stable/using_sklearn.html#process-input
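As a rough starting point, an override for text input might look like the sketch below. It is not the code from your train.py, and it rests on several assumptions: each CSV row holds one message in its first column, messages that contain commas are quoted, and the object returned by your model_fn accepts raw strings (for example, a pipeline that includes the TfidfVectorizer). The function names and signatures follow the linked doc; everything else is illustrative.

```python
import csv
import io


def input_fn(input_data, content_type):
    """Parse a text/csv request body into a list of raw strings."""
    if content_type != "text/csv":
        raise ValueError("Unsupported content type: {}".format(content_type))
    # The payload may arrive as bytes; replace undecodable bytes instead of failing.
    if isinstance(input_data, bytes):
        input_data = input_data.decode("utf-8", errors="replace")
    # Assumption: one message per row, stored in the first column.
    reader = csv.reader(io.StringIO(input_data))
    return [row[0] for row in reader if row]


def predict_fn(messages, model):
    """Pass the raw strings to whatever model_fn returned."""
    # Assumption: the model vectorizes text itself (e.g. TF-IDF inside a pipeline).
    return model.predict(messages)
```

If the vectorizer is saved separately from the classifier, predict_fn is the place to apply it before calling the classifier.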
If you still have problems, you may share examples from your custom code.
Upvotes: 5