Reputation: 2639
I have a general question about DoFn. According to this doc:
If required, a fresh instance of the argument DoFn is created on a worker, and the DoFn.Setup method is called on this instance. This may be through deserialization or other means. A PipelineRunner may reuse DoFn instances for multiple bundles. A DoFn that has terminated abnormally (by throwing an Exception) will never be reused.
Upvotes: 2
Views: 872
Reputation: 5104
If a DoFn throws an exception, the bundle is aborted, and with it the work on that element. Generally, with a production-level runner, this work will be retried a certain number of times (e.g. up to 4 times for Dataflow batch pipelines, indefinitely for Dataflow streaming) before the runner gives up and fails the full pipeline.
In no instance is the message silently dropped.
See https://beam.apache.org/documentation/runtime/model/ for a more detailed explanation of what the model requires here.
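To make the retry behaviour concrete, here is a rough toy model in plain Python. It is not Beam's API or any runner's actual implementation; `run_bundle`, `make_dofn`, `BundleFailed` and `max_attempts` are all names I made up for illustration. It only sketches the contract described above: a failed bundle is retried as a whole with a fresh DoFn instance, and once the retry budget is exhausted the failure is raised rather than the elements being dropped.

```python
class BundleFailed(Exception):
    """Raised when a bundle has used up its retry budget (hypothetical name)."""


def run_bundle(make_dofn, elements, max_attempts=4):
    """Process one bundle, retrying the whole bundle on any exception.

    make_dofn: hypothetical factory returning a fresh callable standing in for a DoFn.
    max_attempts: assumed retry budget; 4 mirrors the Dataflow batch figure.
    Returns (outputs, attempt_number) on success.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        # An instance that threw is never reused, so build a new one per attempt.
        dofn = make_dofn()
        try:
            # Outputs are only returned if every element in the bundle succeeds.
            return [dofn(e) for e in elements], attempt
        except Exception as exc:
            last_error = exc
    # Retries exhausted: fail loudly instead of silently dropping elements.
    raise BundleFailed(f"gave up after {max_attempts} attempts") from last_error


# Usage: a stand-in DoFn whose first two calls fail, then succeed.
calls = {"n": 0}


def make_flaky_dofn():
    def dofn(x):
        calls["n"] += 1
        if calls["n"] <= 2:
            raise ValueError("transient failure")
        return x * 2
    return dofn


outputs, attempts = run_bundle(make_flaky_dofn, [1, 2, 3])
print(outputs, attempts)
```

In this sketch the first two attempts abort on the first element, and the third attempt processes the whole bundle again from the start. A DoFn that always throws would end in `BundleFailed`, which corresponds to the runner failing the pipeline.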
Upvotes: 2