Reputation: 383
I have been working on a small Python package to solve a class of PDEs using scipy.integrate.solve_ivp. As discretizations are made finer, runtime becomes a bottleneck, especially when I need to solve the PDE for a large number of different initial conditions.
I would like to make use of GPU acceleration to speed things up, but I am unsure of how to integrate GPU-based computations into my current implementation. Here is an example of my implementation on Google Colab. In the notebook, I also tried using CuPy to transfer data to the GPU, perform a forward step, and then transfer back to the CPU, but the transfer overhead was too large.
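Roughly, the per-step pattern I tried was the following (a simplified sketch; the 1-D diffusion stencil here is only a stand-in for my actual discretization):

    import numpy as np
    import cupy as cp

    # Each step copies the state to the GPU, evaluates the update there,
    # and copies the result back to the CPU.
    def forward_step(u, dt, dx):
        u_gpu = cp.asarray(u)                                   # CPU -> GPU copy
        lap = (cp.roll(u_gpu, -1) - 2 * u_gpu + cp.roll(u_gpu, 1)) / dx**2
        return cp.asnumpy(u_gpu + dt * lap)                     # GPU -> CPU copy

    u0 = np.sin(np.linspace(0, np.pi, 100_000))
    u1 = forward_step(u0, dt=1e-9, dx=np.pi / 100_000)

The two copies happen on every evaluation, which is where the overhead comes from.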
Would I have to rewrite the solvers in something like CuPy/JAX to make use of GPUs?
Upvotes: -2
Views: 148
Reputation: 3873
Consider adding some example code and focusing your questions to avoid this being closed. In the meantime, there are things I can say.
Question 1:
I've gotten ~60x speedup using a CuPy callable (Tesla P100 GPU on Colab in 2020) with solve_ivp, despite the overhead of back-and-forth data transfer.
As you described, at the beginning of the callable I transferred the state to a CuPy array on the GPU, and after CuPy evaluated the time derivative, I transferred the data back to a NumPy array on the CPU. This is documented in section 4.2.2 of this notebook.
Of course, it depends on the application. I was simulating all pairwise interactions between hundreds of "robots", so each evaluation did enough O(n^2) arithmetic to dwarf the transfer cost. There's not enough information here to say whether the same approach will work well for your PDE.
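As a rough illustration of the pattern (this is not the code from the notebook, and the interaction law below is made up), the callable looks like this:

    import numpy as np
    import cupy as cp
    from scipy.integrate import solve_ivp

    n = 500                                     # number of agents
    y0 = np.random.rand(2 * n)                  # packed (x, y) positions

    def rhs(t, y):
        pos = cp.asarray(y).reshape(n, 2)       # CPU -> GPU transfer
        diff = pos[None, :, :] - pos[:, None, :]            # all pairwise displacements
        dist2 = cp.sum(diff**2, axis=-1) + 1e-6              # soften to avoid divide-by-zero
        dpos = cp.sum(diff / dist2[..., None], axis=1)       # crude attraction law
        return cp.asnumpy(dpos).ravel()         # GPU -> CPU transfer

    sol = solve_ivp(rhs, (0.0, 0.1), y0, method="RK45")

Only two small copies happen per evaluation, while the O(n^2) interaction sums stay on the GPU.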
Question 2:
This also depends on the method you're using and the dynamics. If you're using an implicit method, it sounds like you could end up with a huge but very sparse system of equations. If you can provide the Jacobian via the jac and jac_sparsity arguments, there may be hope. If you're using an explicit method and the dynamics are not stiff, it also might be OK.
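For example (a sketch assuming a 1-D method-of-lines discretization, so the Jacobian is tridiagonal), jac_sparsity can be passed like this:

    import numpy as np
    from scipy.integrate import solve_ivp
    from scipy.sparse import diags

    N = 2000
    dx = 1.0 / N

    def rhs(t, y):
        # Second-difference Laplacian with zero boundary values.
        d2 = np.empty_like(y)
        d2[1:-1] = (y[2:] - 2 * y[1:-1] + y[:-2]) / dx**2
        d2[0] = (y[1] - 2 * y[0]) / dx**2
        d2[-1] = (y[-2] - 2 * y[-1]) / dx**2
        return d2

    # Tridiagonal sparsity pattern, so the implicit solver approximates and
    # factors a sparse Jacobian instead of a dense N x N matrix.
    sparsity = diags([1, 1, 1], [-1, 0, 1], shape=(N, N))

    y0 = np.sin(np.pi * np.linspace(0, 1, N))
    sol = solve_ivp(rhs, (0.0, 0.1), y0, method="BDF", jac_sparsity=sparsity)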
Upvotes: 1