JoseD
JoseD

Reputation: 11

is there a way to inject and a config into a ParDo without sideInput?

I have a ParDo that uses state and timers with a periodically updating PcollectionView as sideInput to that parDo; google dataflow will throw an exception that timers are not allowed in such a case. Is there another way to feed config data to the parDo with out sideInput? Essentially, the sideInput was a map of config data that was reading from datastore about every 24 hours.

I am currently trying to see if I can create a ParDo before the one with state and timers to periodically update the config, but I don't see how we can access that map from within the next ParDo. Any suggestions?

Note: This pipeline is running in streaming mode with a global window and reading from pubsub messages as they arrive. Datastore is used to hold data needed to decide when to output an element to a pubsub topic.

Upvotes: 1

Views: 104

Answers (1)

Ryan M
Ryan M

Reputation: 342

Instead of using state timers to update the side input, you can use a fixed window to periodically update your PCollectionView with your data source:

        PCollectionView<Map<String,String>> sideInput = pipeline
                .apply(notifications)
                .apply(
                        Window.<Long>into(FixedWindows.of(Duration.standardMinutes(refreshMinutes)))
                                .triggering(
                                        Repeatedly.forever(AfterPane.elementCountAtLeast(1))
                                )
                                .withAllowedLateness(Duration.ZERO)
                                .discardingFiredPanes()
                )
                .apply( /* query data source */ )
                .apply(View.<Map<String,String>>asSingleton());

Upvotes: 0

Related Questions