Jerome tan

Reputation: 155

How to use custom parquet compression algorithm?

Is it possible to use a custom compression algorithm in Spark to read and write Parquet files?

Ideally, it would be configured as follows:

sqlContext.setConf("spark.sql.parquet.compression.codec", "myalgo")

Upvotes: 3

Views: 1780

Answers (1)

stefanobaghino

Reputation: 12794

No; as stated in the documentation (here referring to version 2.2.0), the only acceptable values are

  • uncompressed,
  • snappy,
  • gzip and
  • lzo

with snappy being the default one.

This is due to a limitation of Parquet itself, which only supports a restricted set of compression algorithms, as listed in this enumeration (valid for version 1.5.0).
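For reference, selecting one of the supported codecs looks like this (a minimal sketch assuming Spark 2.x; the app name, sample data, and output path are placeholders):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("parquet-codec").getOrCreate()

// Session-wide setting: applies to every Parquet write in this session
spark.conf.set("spark.sql.parquet.compression.codec", "gzip")

// Per-write setting: overrides the session configuration for this write only
val df = spark.range(100).toDF("id")  // placeholder data
df.write.option("compression", "gzip").parquet("/tmp/output")  // placeholder path

An unsupported value (such as the hypothetical myalgo in the question) is rejected when the write is attempted.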

Upvotes: 4
