Raghav salotra

Reputation: 870

Syncing Kafka with AWS S3 using a different directory structure

We have events coming into Kafka, and we are using Kafka Connect to sync these events to AWS S3. The data shows up in S3 in the directory structure below:

bucket_name/sub_folder/
                       Partition=0/events.json
                       Partition=1/events.json
                       Partition=2/events.json

Is there a way to store the data in one of the directory structures below instead?

bucket_name/sub_folder/date=today_date/events.json
bucket_name/sub_folder/Partition=0..2/date=today_date/events.json

The motivation is to store each day's events in that day's directory. I searched the web but could not find a way to do this. Thanks in advance.

Upvotes: 0

Views: 266

Answers (1)

Robin Moffatt

Reputation: 32090

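You can use the TimeBasedPartitioner, which partitions data according to ingestion time or, when a timestamp.extractor is set as in the example here, according to a timestamp taken from the record.

For example, for hourly partitioning: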

[…]
"partitioner.class": "io.confluent.connect.storage.partitioner.TimeBasedPartitioner",
"path.format": "'year'=YYYY/'month'=MM/'day'=dd/'hour'=HH",
"locale": "US",
"timezone": "UTC",
"partition.duration.ms": "3600000",
"timestamp.extractor": "RecordField",
"timestamp.field": "my_record_field_with_timestamp_in",
[…]
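Since the question asks for one directory per day rather than per hour, the same approach can be adapted. The fragment below is a minimal sketch, not a tested configuration: it assumes a single `date=` path segment is wanted, that the Kafka record timestamp is an acceptable source for the date (`"timestamp.extractor": "Record"`; `Wallclock` and `RecordField` are the other extractor options), and that UTC day boundaries are fine. Only the partitioner-related keys are shown; they would be merged into the rest of the connector config.

```json
{
  "partitioner.class": "io.confluent.connect.storage.partitioner.TimeBasedPartitioner",
  "path.format": "'date'=YYYY-MM-dd",
  "locale": "US",
  "timezone": "UTC",
  "partition.duration.ms": "86400000",
  "timestamp.extractor": "Record"
}
```

Here `partition.duration.ms` is 24 × 60 × 60 × 1000 = 86400000, i.e. one day. Note that the connector chooses the object names inside each `date=...` directory itself, so the files will most likely not be called `events.json` literally.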

Upvotes: 2
