Reputation: 66
I have seen a lot of Apache Beam examples that read data from Pub/Sub and write to a GCS bucket, but is there any example of using KafkaIO and writing to a GCS bucket, where I can parse the message and put it in the appropriate bucket based on the message content?
For example:
message = {type="type_x", some other attributes....}
message = {type="type_y", some other attributes....}
type_x --> goes to bucket x
type_y --> goes to bucket y
My use case is streaming data from Kafka to a GCS bucket, so if someone can suggest a better way to do this in GCP, that is welcome too.
Thanks. Regards, Anant.
Upvotes: 2
Views: 608
Reputation: 356
You can use Secor to load messages into a GCS bucket. Secor is also able to parse incoming messages and put them under different paths in the same bucket.
Upvotes: 1
Reputation: 2825
You can take a look at the example here: https://github.com/0x0ece/beam-starter/blob/master/src/main/java/com/dataradiant/beam/examples/StreamWordCount.java
Once you have read the data elements, if you want to write to multiple destinations based on a specific data value, you can produce multiple outputs using TupleTagList. The details can be found here: https://beam.apache.org/documentation/programming-guide/#additional-outputs
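As a rough illustration of that approach (a sketch only; the routing helper, class name, and bucket paths below are my own assumptions, not taken from the linked example): the per-element logic decides which destination a message belongs to. In a real pipeline you would register one TupleTag per type, have a DoFn emit each element to the matching tag via MultiOutputReceiver, and attach a separate write transform (e.g. TextIO.write().to(...)) to each tagged output.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical routing helper: extracts the "type" attribute from a message
// such as {type="type_x", ...} and maps it to a destination bucket path.
// Inside a Beam DoFn you would call this from processElement(...) and emit
// the element to the TupleTag associated with the returned bucket.
public class MessageRouter {
    private static final Pattern TYPE = Pattern.compile("type=\"([^\"]+)\"");

    public static String bucketFor(String message) {
        Matcher m = TYPE.matcher(message);
        if (!m.find()) {
            return "gs://bucket-unknown/";  // fallback for unparseable messages
        }
        switch (m.group(1)) {
            case "type_x":
                return "gs://bucket-x/";
            case "type_y":
                return "gs://bucket-y/";
            default:
                return "gs://bucket-unknown/";
        }
    }

    public static void main(String[] args) {
        System.out.println(bucketFor("{type=\"type_x\", id=1}"));
        System.out.println(bucketFor("{type=\"type_y\", id=2}"));
    }
}
```

An alternative worth considering is Beam's FileIO.writeDynamic, which lets a single write transform choose the destination per element instead of maintaining one tagged output per type.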
Upvotes: 0