marvi

Reputation: 186

Installing Spark 2 on CDH 5.* with RPM?

I have a Cloudera CDH 5.11 cluster installed from RPM packages (we don't want to use Cloudera Manager or parcels). Has anyone found/built Spark 2 RPM packages for CDH? It seems Cloudera only ships Spark 2 as parcels.

Upvotes: 1

Views: 572

Answers (4)

marvi

Reputation: 186

As of CDH 6.0, Spark 2 is included as RPM packages. Problem solved.

Upvotes: 0

marvi

Reputation: 186

The best way is to run Spark on YARN instead of using a standalone Spark Master/Worker setup. You are then free to use any Spark version you like, independent of what the vendor ships.

What you need to do is package the Spark History Server so you can look at jobs after they finish. And if you want to use Dynamic Allocation, you need the Spark Shuffle Service configured in YARN.
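
As a rough illustration of the settings involved, here is a minimal Python sketch that renders the relevant spark-defaults.conf lines and lists the yarn-site.xml properties for the shuffle service. The property names are standard Spark/YARN ones, but the HDFS directory and the helper function name are placeholders I made up, and your cluster may need more than this.

```python
# Sketch only: settings for running a self-packaged Spark 2 on YARN.
# EVENT_LOG_DIR is a hypothetical path; use any HDFS directory that
# Spark jobs can write to and the History Server can read from.
EVENT_LOG_DIR = "hdfs:///spark2-history"

# Entries for spark-defaults.conf (whitespace-separated key/value lines).
SPARK_DEFAULTS = {
    "spark.master": "yarn",
    # Event logs are what the History Server reads after a job finishes.
    "spark.eventLog.enabled": "true",
    "spark.eventLog.dir": EVENT_LOG_DIR,
    "spark.history.fs.logDirectory": EVENT_LOG_DIR,
    # Dynamic Allocation requires the external shuffle service.
    "spark.dynamicAllocation.enabled": "true",
    "spark.shuffle.service.enabled": "true",
}

# Properties to merge into yarn-site.xml on every NodeManager.
# Assumes mapreduce_shuffle is already registered as an aux service.
YARN_SITE = {
    "yarn.nodemanager.aux-services": "mapreduce_shuffle,spark_shuffle",
    "yarn.nodemanager.aux-services.spark_shuffle.class":
        "org.apache.spark.network.yarn.YarnShuffleService",
}


def render_spark_defaults(props):
    """Render a dict as spark-defaults.conf text, one 'key value' per line."""
    return "".join(f"{key} {value}\n" for key, value in props.items())


if __name__ == "__main__":
    print(render_spark_defaults(SPARK_DEFAULTS))
    for key, value in YARN_SITE.items():
        print(f"{key} = {value}")
```

The Spark YARN shuffle jar that ships with your Spark build also has to be on the NodeManager classpath, and the NodeManagers need a restart before the spark_shuffle aux service is picked up.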

Upvotes: 1

BurritoBoy

Reputation: 45

Looks like I can't comment on an issue so excuse this post as an answer.

Is it possible to install the Spark 2 parcel on an RPM-installed cluster using CM?

Upvotes: 0

tk421

Reputation: 5947

You won't. For now, the doc "Spark 2 Known Issues" clearly states:

Package Install is not Supported

The Cloudera Distribution of Apache Spark 2 is only installable as a parcel.

https://www.cloudera.com/documentation/spark2/latest/topics/spark2_known_issues.html#ki_package_install

Upvotes: 1
