J.J

Reputation: 3607

Parallel compression algorithms

Most compression formats have a parallel implementation (like pigz for gzip, etc.).

However, the reduction in time is rarely proportional to the number of processors thrown at the task, and most implementations see no benefit at all beyond about 6 processors.

I'm curious to know whether there are any compression formats with parallel decompression built into the design - i.e. formats that would theoretically be 100x faster with 100 CPUs than with 1.
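As an illustration of the kind of scaling test I have in mind, here is a minimal sketch (assuming pigz is installed, that its -p flag sets the thread count, and that big.bin stands in for a suitably large input file):

```python
import shutil
import subprocess
import time

# Hypothetical scaling test: compress the same file with pigz at several
# thread counts and see how close the speedup comes to linear.
# Assumes pigz is on the PATH; "big.bin" is a placeholder for a large input.
assert shutil.which("pigz"), "pigz not found on PATH"

INPUT = "big.bin"

for threads in (1, 2, 4, 8, 16):
    start = time.perf_counter()
    # -c writes the compressed stream to stdout (discarded here),
    # -p sets the number of compression threads.
    subprocess.run(
        ["pigz", "-c", "-p", str(threads), INPUT],
        stdout=subprocess.DEVNULL,
        check=True,
    )
    print(f"{threads:>2} threads: {time.perf_counter() - start:.2f} s")
```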

Thank you and all the best :)

Upvotes: 1

Views: 1855

Answers (1)

Mark Adler

Reputation: 112384

You're probably I/O bound. At some point more processors won't help if they're waiting for input or output. You just get more processors waiting.

Or maybe your input files aren't big enough.

pigz will in fact be 100x faster with 100 CPUs, for a sufficiently large input, if it is not I/O bound. By default, pigz sends 128K blocks to each processor to work on, so you would need the input to be at least 13 MB to provide work for all 100 processors - and ideally a good bit more than that, so that all the processors are running at full steam at the same time.
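To make the arithmetic behind that 13 MB figure explicit, here is a quick sketch (the 128 KiB block size is pigz's documented default, settable with -b; the helper function here is just for illustration):

```python
# pigz's default block size is 128 KiB (the -b option, given in KiB).
BLOCK_SIZE = 128 * 1024  # bytes

def min_input_size(processors: int, blocks_per_cpu: int = 1) -> int:
    """Smallest input (in bytes) that hands each processor
    `blocks_per_cpu` blocks to compress."""
    return BLOCK_SIZE * processors * blocks_per_cpu

print(min_input_size(100) / 1e6)      # ~13.1 MB: one block per CPU
print(min_input_size(100, 10) / 1e6)  # ~131 MB keeps 100 CPUs busy longer
```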

Upvotes: 2
