S.V

Reputation: 2793

blosc vs blosc2: parallel compression

Could you please help me understand whether the blosc library supports parallel compression?

My understanding is that blosc2 supports parallel compression, but I am having difficulty finding a definitive answer on whether the first version of blosc supports it as well.

Setting the BLOSC_NTHREADS environment variable to 10 does not seem to have any effect on CPU usage during blosc compression when pandas writes data to an HDF5 file.

Upvotes: 0

Views: 99

Answers (1)

Francesc

Reputation: 376

Both blosc and blosc2 support parallel compression. However, to get the full benefit of the acceleration, you need to choose appropriate chunk sizes (and block sizes for blosc2).

For example, the latest python-blosc2 3.x series can automatically select appropriate chunk/block sizes and the number of threads to accelerate data access and storage for your specific machine. For more info on the kind of performance this can deliver, see: https://www.blosc.org/python-blosc2/getting_started/overview.html#operating-with-ndarrays
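
To illustrate why chunk size matters for parallelism, here is a minimal sketch that uses only the Python standard library. It is not Blosc's API or implementation: `zlib` stands in for the codec, and `compress_chunked` / `decompress_chunked` are hypothetical helper names made up for this example. The point is just that a thread pool can only stay busy when the data is split into enough independent pieces.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def compress_chunked(data: bytes, chunk_size: int, nthreads: int) -> list:
    """Split data into fixed-size chunks and compress them on a thread pool."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    # zlib releases the GIL while compressing, so threads can run in parallel.
    with ThreadPoolExecutor(max_workers=nthreads) as pool:
        return list(pool.map(zlib.compress, chunks))


def decompress_chunked(compressed: list) -> bytes:
    """Decompress each chunk and join the results in the original order."""
    return b"".join(zlib.decompress(c) for c in compressed)


data = bytes(range(256)) * 40_000  # about 10 MB of sample data

# One huge chunk: only one thread has work, regardless of nthreads.
single = compress_chunked(data, chunk_size=len(data), nthreads=4)

# Many 1 MiB chunks: each thread can pick up a chunk of its own.
multi = compress_chunked(data, chunk_size=2**20, nthreads=4)

assert decompress_chunked(single) == data
assert decompress_chunked(multi) == data
print(len(single), len(multi))  # number of chunks produced in each case
```

With a single chunk, asking for more threads changes nothing because there is only one unit of work. The same general idea applies when a thread setting appears to have no effect on CPU usage: the amount of data handed to the compressor per call may be too small to split across threads.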

Upvotes: 1
