jtm123

Reputation: 11

Running Triton Server Inference on AWS GPU Graviton instance

I am currently running a Triton server in production on AWS using a standard GPU-enabled EC2 instance, which is very expensive.

I have seen that the new GPU-enabled Graviton instances can be 40% cheaper to run. However, they use ARM (arm64) CPUs rather than x86-64 (amd64). Can I still run the standard version of Triton server on such an instance?

Looking at the Triton server release notes, I have seen that it can run on the Jetson Nano, which pairs an NVIDIA GPU with an ARM CPU: https://github.com/triton-inference-server/server/releases/tag/v1.12.0

Would this approach reduce my costs? Can I run Triton server on these Graviton instances?

Does performance drop on these instances?

Upvotes: 1

Views: 348

Answers (1)

Geoffrey Blake

Reputation: 179

Looking at NVIDIA's NGC container repository, there are Triton containers built for Arm64 for the most recent version, so on the surface it should work on G5g. I would recommend trying the container and testing whether it suits your needs. Without testing your specific workload, it is impossible to know up front what the performance will be and, by extension, whether it is cheaper.
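Once you have measured throughput on both instance types, the cost comparison is simple arithmetic. Here is a minimal sketch of it. The function names are my own, and the prices and throughput figures are made-up placeholders, not real AWS pricing or benchmark results, so substitute your own measurements.

```python
def cost_per_million(hourly_price_usd: float, throughput_per_sec: float) -> float:
    """USD needed to serve one million inferences at a sustained throughput."""
    if throughput_per_sec <= 0:
        raise ValueError("throughput_per_sec must be positive")
    inferences_per_hour = throughput_per_sec * 3600
    return hourly_price_usd / inferences_per_hour * 1_000_000


def break_even_throughput(current_throughput: float,
                          current_price: float,
                          candidate_price: float) -> float:
    """Throughput the candidate instance must exceed to be cheaper per inference."""
    return current_throughput * candidate_price / current_price


# Placeholder figures only (assumptions): a candidate priced 40% lower
# that turns out to be slower on this hypothetical workload.
current = cost_per_million(hourly_price_usd=1.00, throughput_per_sec=500.0)
candidate = cost_per_million(hourly_price_usd=0.60, throughput_per_sec=400.0)
needed = break_even_throughput(500.0, 1.00, 0.60)

print(f"current:   ${current:.4f} per million inferences")
print(f"candidate: ${candidate:.4f} per million inferences")
print(f"candidate must sustain more than {needed:.0f} inferences/s to be cheaper")
```

In other words, an instance that is 40% cheaper per hour only saves money if it sustains more than 60% of your current throughput on your model, which is why benchmarking your own workload is the deciding step.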

Upvotes: 0
