Running Triton Server Inference on AWS GPU Graviton instance

Question

I am currently running a Triton server in production on AWS using a standard GPU-enabled EC2 instance, which is very expensive.

I have seen that the new GPU-enabled Graviton instances can be 40% cheaper to run. However, they use ARM CPUs (not x86-64/AMD64). Can I still run the standard version of Triton server on these instances?

Looking at the Triton server release notes, I see that it can run on Jetson Nano, which is an ARM platform with an NVIDIA GPU: https://github.com/triton-inference-server/server/releases/tag/v1.12.0

Would this approach reduce my costs? Can I run Triton server on these Graviton instances?

Does performance drop on these instances?
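
To make the cost and performance questions concrete, here is a minimal sketch of how the two could be weighed against each other. It assumes cost is compared per inference at sustained throughput; the function names and the prices and throughputs in the usage note are hypothetical placeholders, not measured or published figures.

```python
def cost_per_million(hourly_price_usd: float, throughput_per_sec: float) -> float:
    """USD needed to serve one million inferences at a sustained throughput."""
    if hourly_price_usd < 0 or throughput_per_sec <= 0:
        raise ValueError("price must be >= 0 and throughput must be > 0")
    seconds_needed = 1_000_000 / throughput_per_sec
    return hourly_price_usd * seconds_needed / 3600


def break_even_throughput_ratio(price_discount: float) -> float:
    """Fraction of baseline throughput the cheaper instance must keep
    for its cost per inference to equal the baseline's.

    price_discount is a fraction, e.g. 0.40 for "40% cheaper per hour".
    """
    if not 0 <= price_discount < 1:
        raise ValueError("price_discount must be in [0, 1)")
    return 1.0 - price_discount
```

For example, with a hypothetical baseline of 1.00 USD/hour at 100 inferences/s, an instance that is 40% cheaper per hour only lowers the cost per inference if it sustains more than 60 inferences/s, since `break_even_throughput_ratio(0.40)` is 0.6. Real numbers would have to come from benchmarking the actual model on both instance types.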


Answers (1)
