Reputation: 21
With current batch transform inference I see a lot of bottlenecks,
I wonder if this is the existing limitation of the currently available tensorflow serving container v2.8. If thats the case config should I play with to increase the performance
i tried changing max_concurrent_transforms but doesn't seem to really help
my current config
transformer = tensorflow_serving_model.transformer(
instance_count=1,
instance_type="ml.g4dn.12xlarge",
max_concurrent_transforms=0,
output_path=output_data_path,
)
transformer.transform(
data=input_data_path,
split_type='Line',
content_type="text/csv",
job_name = job_name + datetime.now().strftime("%m-%d-%Y-%H-%M-%S"),
)
Upvotes: 1
Views: 1778
Reputation: 5578
Generally speaking, you should first have a performing model (steps 1+2 below) yielding a satisfactory TPS, before you move over to batch transform parallelization techniques to push your overall TPS higher with parallization nobs.
Steps:
Upvotes: 2