Reputation: 166
Batch transform with a mini-batch size > 1 of images doesn't work as I expect.
I'm using Sagemaker Batch transform for inference. I'm trying to preprocess images on a custom container that I created (using model pipelining: the first model is the pre-processor that I'm asking about, and the second model is an Nvidia-triton inference server).
I'm using batch transform as follows:
transformer = sagemaker.transformer.Transformer(
    model_name=model_name,
    instance_count=instance_count,
    instance_type=instance_type,
    max_concurrent_transforms=16,
    output_path=inference_output_data,
    strategy='MultiRecord')
transformer.transform(
    data=batch_input,
    content_type='image/jpeg',
    job_name=job_name,
    split_type='Line',
    wait=False,
    logs=False)
Note the strategy and split_type values of 'MultiRecord' and 'Line', which I set so I would get mini-batches > 1.
When I did my sanity check with batch size of 1, the following code worked fine:
import io

import flask
from PIL import Image

@app.route('/invocations', methods=['POST'])
def transformation():
    img_bytes = flask.request.data
    img = Image.open(io.BytesIO(img_bytes))
    preprocessed_image = preprocess_image(img)
However, with batching, this code doesn't work anymore. Basically I would not expect it to work, since it is only valid for reading a single image. I even printed the bytes that my container receives, and they are not the original bytes of the image but some strange mix of the images.
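The "strange mix" is consistent with how Line splitting works on a binary format: JPEG data routinely contains 0x0A bytes, so splitting the input on newlines slices images at arbitrary positions. A small sketch with fabricated byte strings (not real JPEG data) illustrates this:

```python
# Two fake "JPEG" payloads: binary blobs that, like real JPEG files,
# happen to contain newline (0x0A) bytes in the middle of the data.
img_a = b"\xff\xd8\xff\xe0" + b"pixels\nmore-pixels" + b"\xff\xd9"
img_b = b"\xff\xd8\xff\xe0" + b"other\npixels" + b"\xff\xd9"

# With split_type='Line', every b"\n" is treated as a record boundary,
# so the two images are sliced into fragments at arbitrary positions.
records = (img_a + b"\n" + img_b).split(b"\n")

# 2 images become 4 fragments, and none of them is a valid image:
# headers, pixel data, and trailers end up interleaved across fragments.
print(len(records))  # 4
```

This is why the container sees bytes from several images mixed together rather than clean per-image payloads.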
Can you please point out what I am doing wrong?
Thanks.
Upvotes: 2
Views: 1555
Reputation: 494
In the code below -
transformer.transform(
    data=batch_input,
    content_type='image/jpeg',
    job_name=job_name,
    split_type='Line',
    wait=False,
    logs=False)
I see that your input uses content type "image/jpeg", but you set your split_type to Line. That setting works well when you have, say, a large CSV/Parquet file delimited by newlines; to process multiple lines at a time, you can use the MultiRecord strategy to batch them into one request. Furthermore, you can use multiple instances and smaller chunks of the file to process them in a parallel or distributed fashion. Say you have 10 input files and 2 instances: each instance processes 5 files, which speeds up your BT job.
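To make the intended Line + MultiRecord pattern concrete, here is a minimal sketch of an /invocations handler for a newline-delimited text format. The route and Flask setup mirror the question's container; the CSV format and the length-based "prediction" are placeholders for illustration only:

```python
import flask

app = flask.Flask(__name__)

@app.route('/invocations', methods=['POST'])
def transformation():
    # With split_type='Line' and strategy='MultiRecord', SageMaker packs
    # several newline-delimited records into one POST body, and the
    # container splits them back apart itself.
    payload = flask.request.data.decode('utf-8')
    records = [line.split(',') for line in payload.splitlines() if line]

    # Placeholder "model": emit the field count of each record, one per line.
    predictions = [str(len(fields)) for fields in records]
    return flask.Response('\n'.join(predictions), mimetype='text/csv')
```

Because each record is a line of text, splitting and re-joining on newlines is lossless here, which is exactly what breaks down for raw JPEG bytes.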
Coming back to your scenario, I don't think BT is a good fit, given that you are not going to utilize the native capabilities of the service. I would recommend checking out SageMaker Asynchronous Inference, which works with larger payloads (in the order of GBs); furthermore, it allows you to scale the number of instances down to zero and maximize your compute resources.
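As a rough sketch of that alternative, an async endpoint can be configured through the SageMaker Python SDK. The bucket paths, instance type, and the pre-built `model` object are hypothetical; this is a configuration fragment, not a runnable script:

```python
from sagemaker.async_inference import AsyncInferenceConfig

# Hypothetical S3 locations for illustration only.
async_config = AsyncInferenceConfig(
    output_path="s3://my-bucket/async-output/",
    max_concurrent_invocations_per_instance=4,
)

# `model` is assumed to be an already-constructed sagemaker.model.Model.
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g4dn.xlarge",
    async_inference_config=async_config,
)

# Each request references a single image already staged in S3; the endpoint
# writes the result to output_path, so no payload is split or batched.
response = predictor.predict_async(input_path="s3://my-bucket/images/img-0.jpg")
```

Since every invocation carries exactly one object, the whole-image payload reaches the container intact, which sidesteps the Line-splitting problem entirely.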
Attaching a blog here that can help.
Upvotes: 1