I have written a Spark job that reads from a Kafka topic, does some processing, and dumps the data in Avro format to GCS.
I am deploying this Java application on Dataproc Serverless with Trigger.Once mode so that on every run the new data pushed to the Kafka topic is consumed and dumped to GCS.
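For reference, the relevant part of the job looks roughly like this (a minimal sketch; the broker, topic, and GCS paths are placeholders, not my actual values):

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.streaming.Trigger;

public class KafkaToGcsJob {
    public static void main(String[] args) throws Exception {
        SparkSession spark = SparkSession.builder()
                .appName("kafka-to-gcs-avro")
                .getOrCreate();

        // Read new records from the Kafka topic
        Dataset<Row> df = spark.readStream()
                .format("kafka")
                .option("kafka.bootstrap.servers", "broker:9092") // placeholder
                .option("subscribe", "my-topic")                  // placeholder
                .option("startingOffsets", "earliest")
                .load();

        // ... some processing on the value column ...

        // Dump the batch to GCS in Avro format, then stop (Trigger.Once)
        df.writeStream()
                .format("avro")
                .option("path", "gs://my-bucket/output/")            // placeholder
                .option("checkpointLocation", "gs://my-bucket/chk/") // placeholder
                .trigger(Trigger.Once())
                .start()
                .awaitTermination();
    }
}
```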
The strange behaviour is that on the first run the code works absolutely fine, but when I try to rerun it for the next batch I get the error below.
java.io.InvalidClassException: org.apache.spark.sql.avro.AvroDataToCatalyst; local class incompatible: stream classdesc serialVersionUID = -4108983435828400550, local class serialVersionUID = 3066013574753296163