Graham Polley

Reputation: 14791

Jobs broken with SDK version 0.4.150414

Pulled the latest SDK version (0.4.150414) from Maven, and our jobs are now failing.

We've traced it to the deserialisation of a HashMap that is used in one of our classes, which in turn is referenced by the ParDo transformation.

Observations:

Did anything change with the serialization/deserialization functionality in the latest version of the SDK?

Happy to send our code to the feedback email if you need it.

Upvotes: 3

Views: 97

Answers (1)

Ben Chambers

Reputation: 6130

A change was made in the latest version to clone the DoFn when it is passed to ParDo.of. This leads to better behavior if the DoFn is used multiple times and modified between uses.

The problem you describe would happen if the HashMap field was populated after the DoFn was passed to ParDo.of.

You can confirm this by setting a breakpoint at ParDo.of and inspecting the state of the DoFn there. To fix this, initialize the field before invoking ParDo.of.
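To illustrate the ordering issue without pulling in the SDK, here is a rough stand-alone sketch. Everything in it is hypothetical: `LookupFn` stands in for your DoFn, and the local `of` method only imitates the "clone on hand-off" behavior by doing a Java serialization round trip. I'm assuming a serialization-based copy for the sake of the demo; it is not the SDK's actual code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;

public class CloneOnOfDemo {

    // Hypothetical stand-in for a user DoFn that carries a HashMap field.
    static class LookupFn implements Serializable {
        private static final long serialVersionUID = 1L;
        final HashMap<String, String> lookup = new HashMap<>();
    }

    // Hypothetical stand-in for ParDo.of: takes a snapshot of the function
    // via a serialization round trip (an assumption made for this demo).
    static <T extends Serializable> T of(T fn) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(fn);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("could not clone function", e);
        }
    }

    // Broken ordering: the map is filled after the snapshot was taken,
    // so the snapshot never sees the entry.
    static int sizeWhenPopulatedAfterOf() {
        LookupFn fn = new LookupFn();
        LookupFn snapshot = of(fn);
        fn.lookup.put("key", "value");
        return snapshot.lookup.size();
    }

    // Fixed ordering: the map is filled first, so the snapshot includes it.
    static int sizeWhenPopulatedBeforeOf() {
        LookupFn fn = new LookupFn();
        fn.lookup.put("key", "value");
        LookupFn snapshot = of(fn);
        return snapshot.lookup.size();
    }

    public static void main(String[] args) {
        System.out.println("populated after of:  " + sizeWhenPopulatedAfterOf());  // 0
        System.out.println("populated before of: " + sizeWhenPopulatedBeforeOf()); // 1
    }
}
```

In a real pipeline the equivalent change is to populate the HashMap in the DoFn's constructor, or at least before the line that calls ParDo.of.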

Hope this helps!

Upvotes: 5
