Reputation: 261
I have a Dataproc 1.2 cluster that currently runs Spark 2.2.0, but our program fails, and the fix was introduced in Spark 2.2.1 and 2.3.0. Is there a way to upgrade the Spark version without impacting or breaking any of the dependencies in the current cluster?
Upvotes: 3
Views: 4431
Reputation: 512
You can upgrade Spark to the newer 2.3 release, but there is some built-in functionality you cannot use after the upgrade; for example, you may no longer be able to open a file directly from Google Cloud Storage.
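For context, "opening a file directly from Google Cloud Storage" in PySpark normally looks like the sketch below. It assumes the GCS connector that Dataproc preinstalls is available, and the bucket and file names are placeholders:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("gcs-read-check").getOrCreate()

# Dataproc images ship with the GCS connector, so gs:// paths can be
# read directly; "my-bucket" and "data.csv" are placeholder names.
df = spark.read.csv("gs://my-bucket/data.csv", header=True)
df.show(5)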
Here is the link where you can check the release dates of all versions.
They have released version 2.3, but I haven't checked it yet.
I hope they changed the default version, because I want to use pandas_udf in PySpark.
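For anyone curious, here is a minimal sketch of what pandas_udf (available from Spark 2.3) looks like. It assumes PyArrow is installed on the cluster, and the column and function names are made up:

import pandas as pd
from pyspark.sql import SparkSession
from pyspark.sql.functions import pandas_udf, PandasUDFType

spark = SparkSession.builder.appName("pandas-udf-check").getOrCreate()

# Vectorized UDF: receives a whole batch as a pandas Series instead of
# one Python object per row, which is usually much faster.
@pandas_udf("double", PandasUDFType.SCALAR)
def times_two(v):
    return v * 2

df = spark.createDataFrame([(1.0,), (2.0,), (3.0,)], ["x"])
df.select(times_two("x").alias("x2")).show()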
Upvotes: 0
Reputation: 1383
FYI, Spark 2.3 is available in Dataproc 1.3: https://cloud.google.com/dataproc/docs/concepts/versioning/dataproc-versions.
gcloud dataproc clusters create <clustername> --image-version=1.3
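Once the new cluster is up, a quick way to confirm the Spark version (e.g. from a pyspark session on the master node) is:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
print(spark.version)  # should print 2.3.x on a Dataproc 1.3 image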
Upvotes: 2