user1657939

Reputation: 21

Optimize batch transform inference on sagemaker

With my current batch transform inference setup I see several bottlenecks:

  1. Each input file can only hold close to 1,000 records.
  2. It is currently processing about 2,000 records/min on one ml.g4dn.12xlarge instance.
  3. GPU instances are not necessarily giving any advantage over CPU instances.

I wonder if this is a limitation of the currently available TensorFlow Serving container (v2.8). If that's the case, which configs should I tune to increase performance?

I tried changing max_concurrent_transforms, but it doesn't seem to help.

My current config:

transformer = tensorflow_serving_model.transformer(
    instance_count=1,
    instance_type="ml.g4dn.12xlarge",
    max_concurrent_transforms=0,
    output_path=output_data_path,
)

transformer.transform(
    data=input_data_path,
    split_type='Line',
    content_type="text/csv",
    job_name=job_name + datetime.now().strftime("%m-%d-%Y-%H-%M-%S"),
)
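
If the ~1,000-records-per-file cap is part of the bottleneck, one workaround (a sketch, not from the original post; `shard_csv` is a hypothetical helper name) is to split the input into many smaller CSV shards before uploading to S3. Batch transform distributes whole files across instances, so more files also give a multi-instance job more units of parallel work:

```python
import csv
import os

def shard_csv(src_path, out_dir, rows_per_file=1000):
    """Split a headerless CSV into shards of at most rows_per_file rows each.

    Returns the list of shard paths, in order. Batch transform assigns
    whole files to workers, so smaller shards allow finer parallelism.
    """
    os.makedirs(out_dir, exist_ok=True)
    shard_paths = []
    shard, writer = None, None
    with open(src_path, newline="") as src:
        for i, row in enumerate(csv.reader(src)):
            if i % rows_per_file == 0:
                # Start a new shard file every rows_per_file rows.
                if shard:
                    shard.close()
                path = os.path.join(out_dir, f"part-{i // rows_per_file:05d}.csv")
                shard = open(path, "w", newline="")
                writer = csv.writer(shard)
                shard_paths.append(path)
            writer.writerow(row)
    if shard:
        shard.close()
    return shard_paths
```

You would then upload the shard directory to S3 and point `input_data_path` at the prefix rather than a single object.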

Upvotes: 1

Views: 1778

Answers (1)

Gili Nachum

Reputation: 5578

Generally speaking, you should first have a well-performing model (steps 1 and 2 below) yielding a satisfactory TPS before you move on to batch transform parallelization knobs to push your overall TPS higher.
Steps:

  1. GPU enabling - Run a manual test to confirm that your model can utilize a GPU in the first place (this isn't related to batch transform).
  2. Picking an instance - Use SageMaker Inference Recommender to find the most cost-effective instance type to run inference on.
  3. Batch transform inputs - It sounds like you have multiple input files; this is required if you want to speed up the job by adding more instances.
  4. Batch transform single-instance knobs - If you are using the CreateTransformJob API, you can reduce the time it takes to complete batch transform jobs by using optimal values for parameters such as MaxPayloadInMB, MaxConcurrentTransforms, or BatchStrategy. The ideal value for MaxConcurrentTransforms is equal to the number of compute workers in the batch transform job. If you are using the SageMaker console, you can specify these values in the Additional configuration section of the Batch transform job configuration page. SageMaker automatically finds the optimal parameter settings for built-in algorithms; for custom algorithms, provide these values through an execution-parameters endpoint.
  5. Batch transform cluster size - Increase instance_count beyond 1, using the cost-effective instance type you found in steps (1)+(2).
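
For step 5, a rough back-of-the-envelope sizing helper (illustrative only; `instances_needed` is a made-up name, the 2,000 records/min figure is the single-instance throughput reported in the question, and the estimate assumes throughput scales roughly linearly with instance_count, which real jobs only approximate):

```python
import math

def instances_needed(total_records, records_per_min_per_instance, target_minutes):
    """Estimate the instance_count needed to finish a batch transform job
    within target_minutes, assuming near-linear scaling across instances."""
    required_per_min = total_records / target_minutes
    return max(1, math.ceil(required_per_min / records_per_min_per_instance))

# e.g. 1M records at the reported 2,000 records/min per instance,
# with a 60-minute completion target:
print(instances_needed(1_000_000, 2_000, 60))  # -> 9
```

Whatever number this suggests, validate it against the per-instance TPS you measured in steps 1 and 2 before committing to a cluster size.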

Upvotes: 2

Related Questions