Reputation: 6048
I am trying to figure out how to fetch all data from a query initially, then incrementally only changes using kafka connector. The reason for this is i want to load all data into elastic search, then keep es in sync with my kafka streams. Currently, i am doing this by first using connector with mode = bulk, then i change it to timestamp. This works fine.
However, if we ever want to reload all data to the Streams and ES, it means we have to write some scripts that somehow cleans or deletes kafka streams and es indices data, modify the connect ini's to set mode as bulk, restart everything, give it time to load all that data, then modify scripts again to timestamp mode, then restart everything once more(reason for needing such a script is that occasionally, bulk updates happen to correct historic data through an etl process we do not yet have control over, and this process does not update timestamps)
Is anyone doing something similar and have found a more elegant solution?
Upvotes: 7
Views: 4175
Reputation: 6048
coming back to this after a long time. The way was able to solve this and never have to use bulk mode
Upvotes: 1
Reputation: 452
how to fetch all data from a query initially, then incrementally only changes using kafka connector.
Maybe this may help you. For example, I have a table:
╔════╦═════════════╦═══════════╗
║ Id ║ Name ║ Surname ║
╠════╬═════════════╬═══════════╣
║ 1 ║ Martin ║ Scorsese ║
║ 2 ║ Steven ║ Spielberg ║
║ 3 ║ Christopher ║ Nolan ║
╚════╩═════════════╩═══════════╝
In this case I will create a View:
CREATE OR REPLACE VIEW EDGE_DIRECTORS AS
SELECT 0 AS EXID, ID, NAME, SURNAME
FROM DIRECTORS WHERE ID =< 2
UNION ALL
SELECT ID AS EXID, ID, NAME, SURNAME
FROM DIRECTORS WHERE ID > 2;
In the properties file for kafka jdbc connector you may use:
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
mode=incrementing
incrementing.column.name=EXID
topic.prefix=
tasks.max=1
name=gv-jdbc-source-connector
connection.url=
table.types=VIEW
table.whitelist=EDGE_DIRECTORS
So kafka jdbc connector will take steps:
Select EXID, ID, NAME, SURNAME FROM EDGE_DIRECTORS
and will
notice that EXID had been incremented. Upvotes: 0