Reputation: 166
Batch transform with a mini-batch size > 1 of images doesn't work as I expect.
I'm using Sagemaker Batch transform for inference. I'm trying to preprocess images on a custom container that I created (using model pipelining: the first model is the pre-processor that I'm asking about, and the second model is an Nvidia-triton inference server).
I'm using batch transform as follows:
transformer = sagemaker.transformer.Transformer(
    model_name=model_name,
    instance_count=instance_count,
    instance_type=instance_type,
    max_concurrent_transforms=16,
    output_path=inference_output_data,
    strategy='MultiRecord')
transformer.transform(
    data=batch_input,
    content_type='image/jpeg',
    job_name=job_name,
    split_type='Line',
    wait=False,
    logs=False)
Note the strategy and split_type values of 'MultiRecord' and 'Line', which I set so I would get mini-batches > 1.
When I did my sanity check with batch size of 1, the following code worked fine:
import io

import flask
from PIL import Image

@app.route('/invocations', methods=['POST'])
def transformation():
    img_bytes = flask.request.data
    img = Image.open(io.BytesIO(img_bytes))
    preprocessed_image = preprocess_image(img)
However, with batching, this code doesn't work anymore. Basically I would not expect it to work, since it is only valid for reading a single image. I even printed the bytes that my container receives, and they are not the original bytes of the image but some strange mix of the images.
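The "strange mix" is consistent with how Line splitting works on a binary format: JPEG data routinely contains 0x0A bytes, so splitting the input on newlines slices images at arbitrary positions. A small sketch with fabricated byte strings (not real JPEG data) illustrates this:

```python
# Two fake "JPEG" payloads: binary blobs that, like real JPEG files,
# happen to contain newline (0x0A) bytes in the middle of the data.
img_a = b"\xff\xd8\xff\xe0" + b"pixels\nmore-pixels" + b"\xff\xd9"
img_b = b"\xff\xd8\xff\xe0" + b"other\npixels" + b"\xff\xd9"

# With split_type='Line', every b"\n" is treated as a record boundary,
# so the two images are sliced into fragments at arbitrary positions.
records = (img_a + b"\n" + img_b).split(b"\n")

# 2 images become 4 fragments, and none of them is a valid image:
# headers, pixel data, and trailers end up interleaved across fragments.
print(len(records))  # 4
```

This is why the container sees bytes from several images mixed together rather than clean per-image payloads.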
Can you please point out what I am doing wrong?
Thanks.
Upvotes: 2
Views: 1555
Reputation: 494
In the code below -
transformer.transform(
    data=batch_input,
    content_type='image/jpeg',
    job_name=job_name,
    split_type='Line',
    wait=False,
    logs=False)
I see that your input uses content type "image/jpeg", but you set your split_type to Line. That setting works well when you have, say, a large CSV/Parquet file delimited by newlines; to process multiple lines at a time, you can use the MultiRecord strategy to batch them into one request. Furthermore, you can use multiple instances and smaller chunks of the file to process them in a parallel or distributed fashion. Say you have 10 input files and 2 instances: each instance processes 5 files, which speeds up your BT job.
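To make the intended Line + MultiRecord pattern concrete, here is a minimal sketch of an /invocations handler for a newline-delimited text format. The route and Flask setup mirror the question's container; the CSV format and the length-based "prediction" are placeholders for illustration only:

```python
import flask

app = flask.Flask(__name__)

@app.route('/invocations', methods=['POST'])
def transformation():
    # With split_type='Line' and strategy='MultiRecord', SageMaker packs
    # several newline-delimited records into one POST body, and the
    # container splits them back apart itself.
    payload = flask.request.data.decode('utf-8')
    records = [line.split(',') for line in payload.splitlines() if line]

    # Placeholder "model": emit the field count of each record, one per line.
    predictions = [str(len(fields)) for fields in records]
    return flask.Response('\n'.join(predictions), mimetype='text/csv')
```

Because each record is a line of text, splitting and re-joining on newlines is lossless here, which is exactly what breaks down for raw JPEG bytes.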
Coming back to your scenario, I don't think BT is a good fit, given that you are not going to utilize the native capabilities of the service. I would recommend checking out SageMaker Asynchronous Inference, which works with larger payloads (in the order of GBs); furthermore, it allows you to scale the number of instances down to zero and maximize your compute resources.
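As a rough sketch of that alternative, an async endpoint can be configured through the SageMaker Python SDK. The bucket paths, instance type, and the pre-built `model` object are hypothetical; this is a configuration fragment, not a runnable script:

```python
from sagemaker.async_inference import AsyncInferenceConfig

# Hypothetical S3 locations for illustration only.
async_config = AsyncInferenceConfig(
    output_path="s3://my-bucket/async-output/",
    max_concurrent_invocations_per_instance=4,
)

# `model` is assumed to be an already-constructed sagemaker.model.Model.
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g4dn.xlarge",
    async_inference_config=async_config,
)

# Each request references a single image already staged in S3; the endpoint
# writes the result to output_path, so no payload is split or batched.
response = predictor.predict_async(input_path="s3://my-bucket/images/img-0.jpg")
```

Since every invocation carries exactly one object, the whole-image payload reaches the container intact, which sidesteps the Line-splitting problem entirely.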
Attaching a blog here that can help.
Upvotes: 1