Federico

Reputation: 433

How to create and populate Parquet files in HDFS using Java?

What is the best way to create and populate Parquet files in HDFS using Java without the support of Hive or Impala libraries?

My goal is to write a simple CSV record (a String) to a Parquet file located in HDFS.

The existing questions and answers on this topic are confusing.

Upvotes: 0

Views: 905

Answers (1)

ldz

Reputation: 2215

It seems like parquet-mr is the way to go. It provides implementations for Thrift and Avro. A custom implementation should be based on ParquetOutputFormat and might look similar to AvroParquetOutputFormat and AvroWriteSupport, the latter of which does the actual conversion.
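For the simple case in the question, it may be enough to use the Avro binding directly instead of writing a custom ParquetOutputFormat. Below is a minimal sketch, not a definitive implementation. It uses AvroParquetWriter from parquet-avro (which relies on AvroWriteSupport internally) and assumes parquet-avro and hadoop-client are on the classpath. The class name CsvToParquet, the two-column schema (id, name), the sample record and the naive comma split are all made up for illustration; the target path is taken from the first command-line argument.

```java
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.avro.AvroParquetWriter;
import org.apache.parquet.hadoop.ParquetWriter;

class CsvToParquet {

    // Hypothetical two-column schema; adjust it to the real CSV layout.
    static final Schema SCHEMA = SchemaBuilder.record("CsvRecord")
            .namespace("example")
            .fields()
            .requiredInt("id")
            .requiredString("name")
            .endRecord();

    // Converts one CSV line into an Avro record.
    // Assumption: plain comma-separated values, no quoted or escaped fields.
    static GenericRecord toRecord(String csvLine) {
        String[] cols = csvLine.split(",", -1);
        if (cols.length != 2) {
            throw new IllegalArgumentException("Expected 2 columns: " + csvLine);
        }
        GenericRecord record = new GenericData.Record(SCHEMA);
        record.put("id", Integer.parseInt(cols[0].trim()));
        record.put("name", cols[1].trim());
        return record;
    }

    public static void main(String[] args) throws IOException {
        // Picks up core-site.xml / hdfs-site.xml if they are on the classpath.
        Configuration conf = new Configuration();
        // Target file, e.g. an hdfs:// URI passed as the first argument.
        Path path = new Path(args[0]);

        try (ParquetWriter<GenericRecord> writer =
                     AvroParquetWriter.<GenericRecord>builder(path)
                             .withSchema(SCHEMA)
                             .withConf(conf)
                             .build()) {
            writer.write(toRecord("1,alice"));
        }
    }
}
```

Neither Hive nor Impala is involved here; the Avro schema only describes the record layout, and the writer converts each GenericRecord to Parquet's columnar format when it is written.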

Upvotes: 1
