Reco Jhonatan

Reputation: 1623

Joining streaming data with bounded data using Apache Beam

I'm trying to understand how a simple data enrichment process works in Apache Beam.

I've designed a first dummy diagram, but I'm not sure how to address this:

[Image: diagram of the proposed pipeline]

I've seen some examples using CoGroupByKey or a lambda, but I'm not sure which one applies here and I feel a little lost.

Is my approach right? Where could I find some examples to understand this better?

Thanks a lot!!

Upvotes: 2

Views: 756

Answers (1)

Jose Gutierrez Paliza

Reputation: 1428

It depends on what you are trying to do. If your bounded data and your streaming data share a key, I would use CoGroupByKey. This does not always work with streamed data, though, for example when the two collections are not windowed compatibly. In that case, pass the bounded data as a side input, and then use a lambda expression or GroupByKey to merge the data. You can look at this example of CoGroupByKey. This is an example of a lambda, and this documentation does a good job of explaining the functions you can use with Apache Beam in Python.

Upvotes: 1
