Daniel Darabos

Reputation: 27470

Kryo registration issue when upgrading to Spark 2.0

I am upgrading an application from Spark 1.6.2 to Spark 2.0.2. The issue is not strictly Spark-related: it comes from the bundled Kryo version. Spark 1.6.2 includes Kryo 2.21, while Spark 2.0.2 includes Kryo 3.0.3.

The application stores some data serialized with Kryo on HDFS. To save space, Kryo registration is enforced. When a class is registered with Kryo, it gets a sequential ID and this ID is used to represent the class in the wire format instead of the full class name. When we register a new class, we always put it at the end, so it gets an unused ID. We also never delete a class from registration. (If a class is deleted, we register a placeholder in its place to reserve the ID.) This way the IDs are stable and one version of the application can read the data written by a previous version.

It turns out Kryo uses the same registration mechanism to register its default classes (the primitive types and String) in its constructor. In Kryo 2.21 it registers 9 such classes, so the first user-registered class gets ID 9. But Kryo 2.22 and later register 10 of them. (void was added.) This means the user-registered classes now start from ID 10, so every one of our class IDs is shifted by one relative to the data already written.
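The shift can be illustrated with a small model of sequential ID assignment. This is only a sketch, not Kryo's actual implementation: `SequentialRegistry`, `IdShiftDemo`, and the `com.example` class names are hypothetical. The only facts taken from the question are the counts of pre-registered classes (9 in Kryo 2.21, 10 in 2.22 and later).

```scala
// Toy model of sequential class registration; NOT Kryo's real code.
// `SequentialRegistry` is a hypothetical name used for illustration only.
final class SequentialRegistry(defaultCount: Int) {
  // IDs 0 until defaultCount are assumed taken by the library's built-in classes.
  private var nextId: Int = defaultCount
  private val ids = scala.collection.mutable.LinkedHashMap.empty[String, Int]

  // Returns the ID of the class, assigning the next unused ID on first registration.
  def register(className: String): Int =
    ids.getOrElseUpdate(className, {
      val id = nextId
      nextId += 1
      id
    })
}

object IdShiftDemo {
  // The same user registration order, applied on top of different defaults.
  // The com.example names are placeholders for application classes.
  val userClasses: Seq[String] =
    Seq("scala.Tuple2", "com.example.Foo", "com.example.Bar")

  def assign(defaultCount: Int): Seq[Int] = {
    val registry = new SequentialRegistry(defaultCount)
    userClasses.map(registry.register)
  }

  val oldIds: Seq[Int] = assign(9)  // 9 built-in classes, as in Kryo 2.21
  val newIds: Seq[Int] = assign(10) // 10 built-in classes, as in Kryo 2.22+

  def main(args: Array[String]): Unit = {
    println(s"old: $oldIds")
    println(s"new: $newIds")
  }
}
```

In this model the same three classes get IDs 9, 10, 11 on top of nine defaults and 10, 11, 12 on top of ten, so a reader built on the newer defaults would look up every stored ID against the wrong class.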

How can we still load the old data after upgrading to Spark 2.0.2?

(It would be great if our first user-registered class were a deprecated class. But it is not. It is scala.Tuple2[_, _].)

Upvotes: 1

Views: 164

Answers (1)

Daniel Darabos

Reputation: 27470

There is actually a Kryo.register(Class type, int id) method that can be used to explicitly specify an ID. The comment for the id parameter says:

id: Must be >= 0. Smaller IDs are serialized more efficiently. IDs 0-8 are used by default for primitive types and String, but these IDs can be repurposed.

The comment has been out of date since 2.22: ID 9 is now also used by default (for void). But it can indeed be repurposed!

kryo.register(classOf[Tuple2[_, _]], 9)

The normal sequential registration works for the rest of the classes. The explicit ID is only necessary for the first class.
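Put together, the registration code might look like the sketch below. It assumes Kryo 3.0.x on the classpath. `LegacyCompatibleRegistration`, `Foo`, and `Bar` are hypothetical stand-ins for the application's own registration routine and classes. It relies on `register(Class)` picking the next unused ID, so once ID 9 is repurposed the following classes get 10, 11, and so on, just as they did under Kryo 2.21.

```scala
import com.esotericsoftware.kryo.Kryo

// Hypothetical application classes, standing in for the real registration list.
case class Foo(x: Int)
case class Bar(s: String)

object LegacyCompatibleRegistration {
  // Registers classes so that their IDs match data written with Kryo 2.21.
  def registerAll(kryo: Kryo): Unit = {
    kryo.setRegistrationRequired(true)
    // Repurpose ID 9 (taken by void since Kryo 2.22) for the first user class.
    kryo.register(classOf[Tuple2[_, _]], 9)
    // The rest are registered sequentially, in the same order as before.
    kryo.register(classOf[Foo])
    kryo.register(classOf[Bar])
  }
}
```

In a Spark application this logic would typically live in the `registerClasses` method of a `KryoRegistrator`. Keeping the explicit ID on the first line of the list makes it obvious that the order of the remaining registrations still matters.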

Upvotes: 1
