Reputation: 493
Is there a way to force Amazon EMR to use Spark 1.0.1? The current selectable versions stop at 1.4.1.
I am using the Alternating Least Squares implementation within MLlib, and since v1.1 they have implemented weighted regularization and for specific reasons (research study) I do not want this implementation, rather I am trying to access the non-weighted regularization version they had implemented in v1.0.
I am using Zepplin notebooks with Scala if that helps.
Upvotes: 0
Views: 283
Reputation: 421
Is working with Zeppelin a requirement? Because if so, it could be very difficult. Zeppelin is compiled against a specific version of Spark so downgrading the jar will most likely fail.
Otherwise, if you are ok with not using Zeppelin and instead using the EMR Step API, then you might be able to spin up an EMR cluster with a bootstrap action that installs spark-assembly 1.0.1. I said it might work, because there's no guarantee that the current EMR version is compatible with a 2 year old version of Spark.
To create the cluster:
s3://support.elasticmapreduce/spark/install-spark
with arguments -v 1.0.1
(See https://github.com/awslabs/emr-bootstrap-actions/tree/master/spark for configuration options)To run spark using the EMR Step API:
aws emr add-steps --cluster-id --steps \ Name=SparkPi,Jar=s3://.elasticmapreduce/libs/script-runner/script-runner.jar,Args=[/home/hadoop/spark/bin/spark-submit,--deploy-mode,cluster,--master,yarn,--class,com.your.spark.class.MainApp,s3://>/your.jar,arg1,arg2],ActionOnFailure=CONTINUE
(Taken from the official github repo at https://github.com/awslabs/emr-bootstrap-actions/blob/master/spark/examples/spark-submit-via-step.md)
Also if that fails, install Hadoop and check out https://spark.apache.org/docs/1.0.1/running-on-yarn.html
Or you could also run 1.0.1 locally on your laptop if your data is small.
Good luck.
Upvotes: 1
Reputation: 94
EMR supports Spark 1.6.0. Take a look at their latest release of emr-4.4.0: http://docs.aws.amazon.com/ElasticMapReduce/latest/ReleaseGuide/emr-whatsnew.html
Upvotes: 0
Reputation: 1508
Amazon EMR provide a list of supported versions of software packages you can install by selecting a drop menu. Nothing stop you from installing additional custom software with a bootstrap action. I had some experience installing java 8 when EMR was supporting only Java 7. It is a bit painful but totally possible.
Upvotes: 0