Reputation: 122
I have a stream of data flowing from a MySQL table through Kafka into my Spark program. When a new row is inserted, I apply transformations to the stream and save the result to Cassandra.
My problem is that when a row is updated, I would like to combine the transformations I made earlier, when the row was first created, with the new update. I understand I have the option of using stateful streaming and database connectors; can someone explain what other options I have when I need to perform an external lookup?
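For context, the stateful-streaming option usually boils down to a merge function like the one below, which classic Spark Streaming would invoke per key via `updateStateByKey`. This is only a sketch of the merge logic in plain Python (no Spark dependency); the field names `amount` and `total` are hypothetical placeholders for whatever your transformations produce.

```python
def merge_updates(new_values, prev_state):
    """Fold newly arrived row versions into the accumulated state.

    new_values: list of dicts for this key in the current micro-batch
    prev_state: the dict carried over from earlier batches, or None
    """
    state = dict(prev_state) if prev_state else {}
    for row in new_values:
        state.update(row)  # later row versions overwrite earlier fields
        # example of an accumulated transformation ("total" is illustrative)
        state["total"] = state.get("total", 0) + row.get("amount", 0)
    return state

# In a real job this function is passed to Spark, roughly:
#   stream.updateStateByKey(merge_updates).foreachRDD(save_to_cassandra)
```

The key idea is that Spark keeps `prev_state` for you between batches, so an update event can be reconciled with the result of the original insert without a round trip to Cassandra.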
Upvotes: 2
Views: 466
Reputation: 5636
I assume you're asking how to handle data mutations in Spark Streaming (as opposed to Structured Streaming)?
For external lookups, a wide variety of datastores can be used in conjunction with Spark; I compiled a sort of master list here a while ago. As far as I know, SnappyData is the only one that lets you perform data mutations on a DataFrame itself.
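Whichever store you pick, the usual lookup pattern is to enrich records inside `mapPartitions`, opening one connection per partition instead of one per record. Here is a minimal sketch of that pattern; the in-memory `EXTERNAL_STORE` dict stands in for a real session (Cassandra, Redis, etc.), and the field names are illustrative only.

```python
EXTERNAL_STORE = {"k1": {"prev_total": 10}}  # stand-in for an external datastore

def enrich_partition(rows):
    """Enrich each row in a partition with previously stored state.

    In a real Spark job you would open the datastore connection here,
    once per partition, e.g. session = cluster.connect(keyspace).
    """
    store = EXTERNAL_STORE
    for row in rows:
        prior = store.get(row["key"], {})
        yield {**row, **prior}  # merge stored fields into the incoming row

# Inside Spark this would be: rdd.mapPartitions(enrich_partition)
enriched = list(enrich_partition([{"key": "k1", "amount": 3}]))
```

The per-partition connection is the important part: connections generally aren't serializable, so they must be created on the executors, and amortizing one across a whole partition keeps the lookup overhead manageable.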
Disclaimer: I work for SnappyData
Upvotes: 2