JDev

Reputation: 1822

Druid: how to drop duplicates in the Kafka indexing service

I am using Druid with the Kafka indexing service. I am trying to understand how it handles duplicate messages.

Example

Consider that I have the following message in a Kafka topic [1 partition only]:

[Offset=100]

{
  "ID":4,
  "POINTS":1005,
  "CREATED_AT":1616258354000000,
  "UPDATED_AT":1616304119000000
}

Now consider that after 24 hours, the same message is somehow pushed to the topic again.

[Offset=101]

{
  "ID":4,
  "POINTS":1005,
  "CREATED_AT":1616258354000000,
  "UPDATED_AT":1616304119000000
}

Note: The payload has not changed.

Actual: In Druid, I now see the same message again.

Expected: Since the payload has not changed, I expect the message to be ignored.

My timestamp column is CREATED_AT.

Upvotes: 0

Views: 1004

Answers (1)

William Nelson

Reputation: 675

Can you be sure that there will never be two unique events with the same timestamp other than duplicates? If so, you can try using rollup to eliminate the duplicates.

You can set that in the granularitySpec. The queryGranularity truncates all timestamps to that granularity, and if ALL dimensions are then identical, the rows get combined using the aggregation functions you set in the spec.
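For example, a minimal granularitySpec sketch (the DAY and SECOND values here are placeholders, not recommendations; pick what fits your data):

{
  "type": "uniform",
  "segmentGranularity": "DAY",
  "queryGranularity": "SECOND",
  "rollup": true
}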

For the aggregation functions, you will want something like MAX or MIN, because SUM would add the duplicates together.
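For your example payload, the metricsSpec could look something like this (treating POINTS and UPDATED_AT as metrics rather than dimensions is an assumption on my part; with longMax, two identical rows roll up to the same value):

"metricsSpec": [
  { "type": "longMax", "name": "POINTS", "fieldName": "POINTS" },
  { "type": "longMax", "name": "UPDATED_AT", "fieldName": "UPDATED_AT" }
]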

This will fail if you have multiple Kafka partitions, since duplicates consumed by different tasks can land in different segments and are not combined at ingestion time, but that could be fixed later with reindexing.

Upvotes: 2
