shashankS

Reputation: 1073

Do all three (Presto, Hive, and Impala) support the Avro data format?

I understand the SerDe available in Hive for supporting Avro schemas, and I am comfortable using Avro with Hive.

AvroSerDe

For example, I found this issue filed against Presto: https://github.com/prestodb/presto/issues/5009

I need to choose components for a fast execution cycle. Presto and Impala offer much shorter execution times. Could anyone clarify which performs better with different data formats? Right now I am primarily looking for Avro support in Presto.

However, let's consider the following data formats stored on HDFS:

  1. Avro format
  2. Parquet format
  3. ORC format

Which engine gives the best performance with each of these data formats? Please suggest.

Upvotes: 0

Views: 1825

Answers (1)

Zoltan

Reputation: 3105

  • Impala can read Avro data but cannot write it. Please refer to this documentation page describing the file formats supported by Impala.

  • Hive supports both reading and writing Avro files.

  • Presto's Hive Connector supports Avro as well. Thanks to David Phillips for pointing out this documentation page.

There are various performance benchmarks on the internet, but I would rather not link to a specific one, since the results depend heavily on the exact use case benchmarked.

Upvotes: 1
