Reputation: 2316
I have a virtual machine with Spark 1.3 on it, but I want to upgrade it to Spark 1.5, primarily because of certain features that are not supported in 1.3. Is it possible to upgrade Spark from 1.3 to 1.5, and if so, how can I do that?
Upvotes: 15
Views: 39175
Reputation: 69
Below are step-by-step instructions for upgrading an Apache Spark pool to v3.4.
Step 1:
Open PowerShell from the Azure Portal to work with the AzSynapseSparkPool cmdlets.
Step 2:
Upgrade the Apache Spark pool using the Update-AzSynapseSparkPool PowerShell cmdlet, as shown below.
Check the current version of the Apache Spark pool:
Get-AzSynapseSparkPool -WorkspaceName <Synapseworkspacename>
Update the Spark version:
Update-AzSynapseSparkPool -WorkspaceName <Synapseworkspacename> -Name <SparkPoolName> -SparkVersion 3.4
Upvotes: 1
Reputation: 19282
1. Set SPARK_HOME to /opt/spark
2. Download the pre-built binary, e.g. spark-2.2.1-bin-hadoop2.7.tgz - can use wget
3. Create a symlink to the downloaded version: ln -s /opt/spark-2.2.1 /opt/spark
4. Edit the files in $SPARK_HOME/conf accordingly
For every new version you download, just create the symlink to it (step 3):
ln -s /opt/spark-x.x.x /opt/spark
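Repointing the symlink at a newer version can be sketched as a small shell function. This is only an illustration: `switch_spark` is a hypothetical helper name, the default link path follows the /opt layout used in this answer, and `ln -sfn` is used instead of plain `ln -s` on the assumption that a link from a previous version already exists and has to be replaced.

```shell
#!/usr/bin/env bash
# Sketch: repoint the "current" Spark symlink at another unpacked version.
# switch_spark is a hypothetical helper, not part of Spark itself.
switch_spark() {
    local target="$1"               # e.g. /opt/spark-2.2.1
    local link="${2:-/opt/spark}"   # the symlink that SPARK_HOME points to
    if [ ! -d "$target" ]; then
        echo "No such directory: $target" >&2
        return 1
    fi
    # -s: symbolic link, -f: replace an existing link,
    # -n: treat an existing link as a file instead of following it
    ln -sfn "$target" "$link"
}
```

For example, `switch_spark /opt/spark-2.2.1` (possibly with sudo, depending on who owns /opt) leaves SPARK_HOME untouched and only changes what /opt/spark resolves to.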
Upvotes: 3
Reputation: 60390
Pre-built Spark distributions, like the one I believe you are using based on another question of yours, are rather straightforward to "upgrade", since Spark is not actually "installed". All you have to do is:
- Download and unpack the new distribution in the directory of your choice (most likely where spark-1.3.1-bin-hadoop2.6 already is)
- Update SPARK_HOME (and possibly some other environment variables, depending on your setup) accordingly
Here is what I just did myself to go from 1.3.1 to 1.5.2, in a setting similar to yours (a Vagrant VM running Ubuntu):
Download the tar file in the appropriate directory
    vagrant@sparkvm2:~$ cd $SPARK_HOME
    vagrant@sparkvm2:/usr/local/bin/spark-1.3.1-bin-hadoop2.6$ cd ..
    vagrant@sparkvm2:/usr/local/bin$ ls
    ipcluster     ipcontroller2  iptest   ipython2    spark-1.3.1-bin-hadoop2.6
    ipcluster2    ipengine       iptest2  jsonschema
    ipcontroller  ipengine2      ipython  pygmentize
    vagrant@sparkvm2:/usr/local/bin$ sudo wget http://apache.tsl.gr/spark/spark-1.5.2/spark-1.5.2-bin-hadoop2.6.tgz
    [...]
    vagrant@sparkvm2:/usr/local/bin$ ls
    ipcluster     ipcontroller2  iptest   ipython2    spark-1.3.1-bin-hadoop2.6
    ipcluster2    ipengine       iptest2  jsonschema  spark-1.5.2-bin-hadoop2.6.tgz
    ipcontroller  ipengine2      ipython  pygmentize
Note that the exact mirror you should use with wget will probably be different from mine, depending on your location; you can get it by clicking the "Download Spark" link on the download page, after you have selected the package type to download.
Unpack the tgz file with
    vagrant@sparkvm2:/usr/local/bin$ sudo tar -xzf spark-1.*.tgz
    vagrant@sparkvm2:/usr/local/bin$ ls
    ipcluster     ipcontroller2  iptest   ipython2    spark-1.3.1-bin-hadoop2.6
    ipcluster2    ipengine       iptest2  jsonschema  spark-1.5.2-bin-hadoop2.6
    ipcontroller  ipengine2      ipython  pygmentize  spark-1.5.2-bin-hadoop2.6.tgz
You can see that you now have a new folder, spark-1.5.2-bin-hadoop2.6.
Update SPARK_HOME (and possibly other environment variables you are using) to point to this new directory instead of the previous one.
And you should be done, after restarting your machine.
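The unpack and SPARK_HOME steps can be sketched together as a shell function. This is a minimal sketch under stated assumptions: `use_spark_archive` is a hypothetical helper name, the archive is assumed to unpack into a folder named after the file (as the pre-built archive above does), and adding `$SPARK_HOME/bin` to PATH is an extra convenience that may not match your setup.

```shell
#!/usr/bin/env bash
# Sketch: unpack one explicitly named Spark archive next to itself
# and point SPARK_HOME at the resulting folder.
# use_spark_archive is a hypothetical helper, not part of Spark.
use_spark_archive() {
    local archive="$1"   # e.g. /usr/local/bin/spark-1.5.2-bin-hadoop2.6.tgz
    local dest_dir home
    dest_dir="$(dirname "$archive")"
    # Name the archive explicitly (no wildcard) so older .tgz files are ignored.
    tar -xzf "$archive" -C "$dest_dir" || return 1
    # Assumption: the archive unpacks into a folder named after the file.
    home="$dest_dir/$(basename "$archive" .tgz)"
    if [ ! -d "$home" ]; then
        echo "Expected folder not found: $home" >&2
        return 1
    fi
    export SPARK_HOME="$home"
    export PATH="$SPARK_HOME/bin:$PATH"
}
```

Calling `use_spark_archive /usr/local/bin/spark-1.5.2-bin-hadoop2.6.tgz` only affects the current shell session (and may need write permission on the directory); for the change to persist, the export of SPARK_HOME has to live wherever your setup currently defines it, for example a shell startup file.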
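Notice that:
- sudo was necessary in my case; it may be unnecessary for you, depending on your settings.
- Once everything works, it is a good idea to delete the downloaded tgz file (see below why).
- If you repeat this procedure for later versions, either make sure that previous tgz files have been deleted, or modify the tar command above to point to a specific file (i.e. no * wildcards as above).
Upvotes: 23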