Reputation: 1
I am working with climate data from the Levante supercomputer in a Jupyter Notebook, but the computations are taking a considerable amount of time. Does anyone have experience with Dask or other parallelisation tools that could speed them up?
Here is an example of a simple operation I run to get the yearly wind-speed means for a given location:
import intake
import xarray as xr

# data_park is an xarray.DataArray (loaded earlier via intake);
# this single line takes a lot of time
vals = data_park.groupby('time.year').mean().plot()
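For context, a minimal sketch of how this pattern behaves when the array is backed by Dask chunks (the data here is a synthetic stand-in for `data_park`, since the original loading code is not shown): `groupby(...).mean()` on a chunked array builds a lazy task graph, and only `compute()` (or `.plot()`, which computes implicitly) triggers the parallel work.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for data_park: two years of hourly wind speeds,
# chunked with Dask so the groupby-mean can run in parallel.
time = pd.date_range("2020-01-01", "2021-12-31 23:00", freq="h")
data_park = xr.DataArray(
    np.random.default_rng(0).rayleigh(8.0, size=time.size),
    coords={"time": time},
    dims="time",
    name="wind_speed",
).chunk({"time": 24 * 90})  # roughly 90-day chunks

# The groupby is lazy on a Dask-backed array; compute() runs the graph.
yearly_means = data_park.groupby("time.year").mean().compute()
print(yearly_means.values)  # one mean per year
```

If `data_park` is currently loaded without chunks, the whole array sits in memory as NumPy and the groupby runs serially; opening it with a `chunks=` argument (or calling `.chunk(...)` as above) is what lets a Dask client parallelise it.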
So far I have tried setting up a client with 25 workers, 5 threads per worker, and a 30 GB memory limit per worker, but I am unsure whether this combination is suitable.
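For reference, this is the kind of `dask.distributed` client setup I mean; the numbers below are deliberately small placeholders, not a recommendation for Levante. One thing to check with any sizing is that workers × memory limit fits the node's RAM (25 workers × 30 GB = 750 GB, which is more than a typical single node provides).

```python
from dask.distributed import Client

# Sketch of a local cluster; the sizes are placeholders to adjust
# to the actual node (cores and RAM of the allocation).
client = Client(
    n_workers=4,            # separate worker processes
    threads_per_worker=2,   # 4 x 2 = 8 cores in use
    memory_limit="4GB",     # per-worker cap; workers spill to disk near it
)
print(client)               # summary of workers, threads and memory
client.close()
```

The dashboard URL printed in the client summary is useful for seeing whether the workers are actually busy or mostly waiting on I/O.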
Thanks in advance! :)
Upvotes: 0
Views: 28