Reputation: 123
My Beam pipeline seems to work properly, but produces a lot of warnings:
[direct-runner-worker] WARN org.apache.beam.sdk.util.MutationDetectors - Coder of type class org.apache.beam.sdk.coders.SerializableCoder has a #structuralValue method which does not return true when the encoding of the elements is equal. Element com.emarsys.dataflow.templates.MetricBundle@73d85187
The class looks like this:
@AllArgsConstructor
@Data
public class MetricBundle implements Serializable {
private Long customerId;
private String query;
private Map<PrimaryKey, Metric> metrics;
}
The problem is with the Map
. PrimaryKey
and Metric
classes are composed of simple types, and implement the Serializable
interface.
The transform producing the PCollection<MetricBundle>
is using .setCoder(SerializableCoder.of(MetricBundle.class));
Do I have to implement a custom coder in order to serialize Map
properly?
During testing, I found no conflicts in the result, all elements are handled properly.
Upvotes: 1
Views: 1884
Reputation: 5104
This is warning that due to the use of SerializableCoder it's not able to check certain properties of your pipeline. (SerializableCoder is also very inefficient.) I would recommend using a more specific coder here, but you don't have to necessarily write one of your own. E.g. it looks like you could decorate your class with @DefaultSchema(JavaFieldSchema.class)
. See https://beam.apache.org/documentation/programming-guide/#schemas
Upvotes: 2