mike01010

Reputation: 6048

Kafka JDBC connector load all data, then incremental

I am trying to figure out how to fetch all data from a query initially, and then only incremental changes, using the Kafka JDBC connector. The reason for this is that I want to load all data into Elasticsearch and then keep ES in sync with my Kafka streams. Currently I am doing this by first running the connector with mode=bulk, then changing it to mode=timestamp. This works fine.
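For illustration, the switch is just flipping the mode in the connector properties, roughly like this (the name, URL, table, and timestamp column below are placeholders, not our real config):

# phase 1: full load
name=my-jdbc-source
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
connection.url=jdbc:postgresql://dbhost/mydb
table.whitelist=MY_TABLE
mode=bulk
topic.prefix=my-

# phase 2: after the load completes, restart with incremental settings
mode=timestamp
timestamp.column.name=LAST_UPDATED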

However, if we ever want to reload all the data into the streams and ES, it means we have to write scripts that somehow clean out or delete the Kafka topic and ES index data, modify the Connect property files to set mode=bulk, restart everything, give it time to load all that data, then modify the configs back to timestamp mode and restart everything once more. (The reason we need such a script is that occasionally bulk updates happen to correct historic data through an ETL process we do not yet control, and that process does not update the timestamps.)

Is anyone doing something similar, and have you found a more elegant solution?

Upvotes: 7

Views: 4175

Answers (2)

mike01010

Reputation: 6048

Coming back to this after a long time. The way I was able to solve this, without ever having to use bulk mode:

  1. Stop the connectors.
  2. Wipe the offset files for each Connect JVM (see the sketch after this list).
  3. (Optional) If you want to do a complete wipe and reload, you probably also want to delete your topics using the Kafka/Connect utils or REST API (and don't forget the internal state topics).
  4. Restart Connect.
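A minimal sketch of steps 2-3 for a standalone worker, run after the worker JVM is stopped. The offset file path is an assumption taken from the sample connect-standalone.properties; adjust it to your worker's offset.storage.file.filename:

import os

# Step 2: wipe the standalone worker's offset file so the source
# connector starts from scratch on the next run.
OFFSET_FILE = "/tmp/connect.offsets"  # assumed path, check your worker config
if os.path.exists(OFFSET_FILE):
    os.remove(OFFSET_FILE)

# Step 3 (optional): delete the output topics, e.g. with the CLI:
#   kafka-topics --bootstrap-server localhost:9092 --delete --topic <your-topic>
# Note: in distributed mode there is no offset file; offsets live in the
# offset.storage.topic, so a full reset there usually means clearing those
# entries or registering the connector under a new name.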

Upvotes: 1

edward_wong

Reputation: 452

"how to fetch all data from a query initially, then incrementally only changes using kafka connector."

Maybe this will help you. For example, I have a table:

╔════╦═════════════╦═══════════╗
║ Id ║    Name     ║  Surname  ║
╠════╬═════════════╬═══════════╣
║  1 ║ Martin      ║ Scorsese  ║
║  2 ║ Steven      ║ Spielberg ║
║  3 ║ Christopher ║ Nolan     ║
╚════╩═════════════╩═══════════╝

In this case I will create a View:

CREATE OR REPLACE VIEW EDGE_DIRECTORS AS
SELECT 0 AS EXID, ID, NAME, SURNAME
FROM DIRECTORS WHERE ID <= 2
UNION ALL
SELECT ID AS EXID, ID, NAME, SURNAME
FROM DIRECTORS WHERE ID > 2;
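With the example rows above, the view returns:

╔══════╦════╦═════════════╦═══════════╗
║ EXID ║ Id ║    Name     ║  Surname  ║
╠══════╬════╬═════════════╬═══════════╣
║    0 ║  1 ║ Martin      ║ Scorsese  ║
║    0 ║  2 ║ Steven      ║ Spielberg ║
║    3 ║  3 ║ Christopher ║ Nolan     ║
╚══════╩════╩═════════════╩═══════════╝

The pre-existing rows are all exposed with EXID = 0, while newer rows expose their own ID, so the incrementing column keeps growing as rows are added.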

In the properties file for the Kafka JDBC connector you may use:

connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
mode=incrementing
incrementing.column.name=EXID
topic.prefix=
tasks.max=1
name=gv-jdbc-source-connector
connection.url=
table.types=VIEW
table.whitelist=EDGE_DIRECTORS

So the Kafka JDBC connector will take these steps:

  1. At first it will fetch all the data, since no offset has been stored yet: the rows where EXID = 0 plus the rows where EXID = ID.
  2. It will store the highest EXID it has seen (3 in this example) as the offset in the worker's offset file.
  3. A new row will be inserted into the DIRECTORS table, so it appears in the view with EXID = ID.
  4. On its next poll, the Kafka JDBC connector will execute SELECT EXID, ID, NAME, SURNAME FROM EDGE_DIRECTORS and notice that EXID has incremented past the stored offset (see the sketch below).
  5. The new data will be published to the topic and flow on to Kafka Streams.
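The incremental poll is then roughly a query of this shape (a sketch; the exact SQL the connector generates varies by database dialect):

SELECT EXID, ID, NAME, SURNAME
FROM EDGE_DIRECTORS
WHERE EXID > 3   -- 3 = the last stored offset
ORDER BY EXID ASC;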

Upvotes: 0
