Reputation: 11
I am working on offloading some workload to the GPU using CUDA in a Spring Boot project. To explain my question better, suppose we want to implement a REST API that does matrix-vector multiplication in a Spring Boot application. We need to load several matrices of various sizes into the GPU's memory at application launch, then accept a user's request containing vector data, find the corresponding matrix on the GPU, perform the matrix-vector multiplication, and finally return the result to the user. We have already implemented the kernel using JCuda.
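To make the intended flow concrete, here is a minimal sketch of the preload-and-lookup structure described above. The GPU-specific parts (JCuda's `cuMemAlloc`/`cuMemcpyHtoD` and the kernel launch) are replaced by a plain CPU multiply so the control flow is clear without a device present; the class and method names (`MatrixStore`, `preload`, `multiply`) are hypothetical, not part of JCuda or Spring.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical service holding the preloaded matrices, keyed by an id
// the client would send along with its vector.
public class MatrixStore {
    private final Map<String, float[][]> matrices = new ConcurrentHashMap<>();

    // Called once at application launch (e.g. from an ApplicationRunner).
    // In the real service, each matrix would be copied to device memory
    // here and a device pointer stored instead of the float[][].
    public void preload(String id, float[][] matrix) {
        matrices.put(id, matrix);
    }

    // Called per request: look up the matrix and multiply it by the vector.
    // The nested loop stands in for the JCuda kernel launch.
    public float[] multiply(String id, float[] vector) {
        float[][] m = matrices.get(id);
        if (m == null) {
            throw new IllegalArgumentException("unknown matrix: " + id);
        }
        float[] result = new float[m.length];
        for (int i = 0; i < m.length; i++) {
            float sum = 0f;
            for (int j = 0; j < vector.length; j++) {
                sum += m[i][j] * vector[j];
            }
            result[i] = sum;
        }
        return result;
    }
}
```

With this shape, a Spring `@RestController` method would just call `multiply` per request, which is where the concurrency questions below come in.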
In this scenario, we want to process users' requests concurrently, so there are several questions I am interested in:
Upvotes: 1
Views: 488