Reputation: 141
I have a couple of questions regarding the Cassandra connector written by Data Mountaineer. Any help is greatly appreciated, as we're trying to figure out the best way to scale our architecture.
Do we have to create a connector config for each Cassandra table we want to update? For instance, let's say I have 1,000 tables. Each table is dedicated to a different type of widget. Each widget has similar characteristics, but slightly different data. Do we need to create a connector for each table? If so, how is this managed and how does it scale?
In Cassandra, we often need to model column families around the business need. We may have three tables representing user information: one keyed by username, one by email, and one by last name. Would we need three connector configs and three separately deployed sink tasks to push data to each table?
Upvotes: 1
Views: 984
Reputation: 26
I think both questions come down to the same thing: can the sink handle multiple topics?
The sink can handle multiple tables in one sink, so one configuration. This is set in the KCQL statement:

connect.cassandra.export.route.query=INSERT INTO orders SELECT * FROM orders-topic;INSERT INTO positions SELECT * FROM positions

but at present they all need to be in the same Cassandra keyspace. The statement above routes events from the orders-topic topic to a Cassandra table called orders, and events from the positions topic to a table called positions. You can also select specific columns and rename them, e.g. SELECT columnA AS columnB.
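To make that concrete for the three user tables in the question, a single sink configuration can carry one KCQL statement per topic/table pair, separated by semicolons. The sketch below is illustrative only: the connector name, topic names, table names, keyspace, and connection properties are assumptions, and exact property names can differ between connector versions, so check the documentation for your release.

# hypothetical single sink covering three user tables in one keyspace
name=cassandra-sink-users
connector.class=com.datamountaineer.streamreactor.connect.cassandra.sink.CassandraSinkConnector
tasks.max=1
topics=users-by-username,users-by-email,users-by-lastname
# one KCQL statement per topic/table pair, separated by semicolons
connect.cassandra.export.route.query=INSERT INTO users_by_username SELECT * FROM users-by-username;INSERT INTO users_by_email SELECT * FROM users-by-email;INSERT INTO users_by_lastname SELECT * FROM users-by-lastname
# connection details (assumed values; property names may vary by connector version)
connect.cassandra.contact.points=localhost
connect.cassandra.port=9042
connect.cassandra.key.space=users

For the 1,000-widget-table case this means generating a longer semicolon-separated KCQL string (and topics list) for one connector, rather than deploying 1,000 connectors, as long as the tables share a keyspace.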
You may want more than one sink instance for separation of concerns, i.e. isolating the writes for one group of topics from other, unrelated topics.
You can scale by increasing the number of tasks the connector is allowed to run (tasks.max); each task starts a Writer for all of the target tables.
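As a sketch of that scaling knob, using the hypothetical sink from the earlier example, raising tasks.max lets Kafka Connect spread the work across its workers:

# allow Kafka Connect to run up to 4 parallel tasks for this sink;
# each task starts its own Writer for every table named in the KCQL
tasks.max=4

In distributed mode the same change can be applied by PUTting the updated configuration to the Kafka Connect REST API for the connector, e.g. /connectors/cassandra-sink-users/config (connector name assumed from the example above).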
We have a support channel of our own for more direct communication. https://datamountaineer.com/contact/
Upvotes: 1