Raman

Reputation: 717

Kafka Connect S3 - JSON to Parquet

Does the Kafka Connect S3 sink support converting JSON to Parquet? I would appreciate any suggestions, or alternatives, that use Kafka Connect S3.

Upvotes: 2

Views: 2016

Answers (1)

Robin Moffatt

Reputation: 32090

Does the Kafka Connect S3 sink support converting JSON to Parquet?

No, it does not. Per the docs page:

You must use the AvroConverter with ParquetFormat in the S3 Sink connector. Attempting to use the JsonConverter (with or without schemas) will result in a runtime exception.

One option would be to use ksqlDB to reserialise your data into Avro first, e.g.:

CREATE STREAM source (COL1 VARCHAR, COL2 INT, COL3 BIGINT) WITH (VALUE_FORMAT='JSON', KAFKA_TOPIC='my_source_topic');

CREATE STREAM target WITH (KAFKA_TOPIC='my_target_topic', VALUE_FORMAT='AVRO') AS SELECT * FROM source;

With that done, you can then sink my_target_topic to S3 using the Parquet format (you can even do this from ksqlDB with CREATE SINK CONNECTOR…).
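As a rough sketch of that last step, a sink connector definition in ksqlDB might look something like the following. The connector name, bucket, region, flush size and Schema Registry URL are all placeholders I've made up for illustration, so check each setting against the S3 sink connector docs for your version before relying on it:

```sql
-- Hypothetical example: sink_s3_parquet, my-bucket, us-east-1, the flush size
-- and the Schema Registry URL are placeholders, not values from the question.
-- The value converter is Avro because ParquetFormat requires the AvroConverter.
CREATE SINK CONNECTOR sink_s3_parquet WITH (
  'connector.class'                     = 'io.confluent.connect.s3.S3SinkConnector',
  'topics'                              = 'my_target_topic',
  's3.bucket.name'                      = 'my-bucket',
  's3.region'                           = 'us-east-1',
  'storage.class'                       = 'io.confluent.connect.s3.storage.S3Storage',
  'format.class'                        = 'io.confluent.connect.s3.format.parquet.ParquetFormat',
  'flush.size'                          = '1000',
  'key.converter'                       = 'org.apache.kafka.connect.storage.StringConverter',
  'value.converter'                     = 'io.confluent.connect.avro.AvroConverter',
  'value.converter.schema.registry.url' = 'http://schema-registry:8081'
);
```

This assumes ksqlDB is configured with a Kafka Connect worker (embedded or external) that has the S3 sink connector installed.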

Upvotes: 1
