Aron ramac

Reputation: 31

Zeppelin - How to set up Zeppelin to connect to a remote Spark master?

I have a 5-node Spark cluster on a separate set of hosts. I installed Zeppelin on a separate host and hooked up the Spark interpreter to execute queries against the Spark cluster.

Zeppelin version 1.6, installed on a desktop

I have tried both of the following: adding "export MASTER=spark://sparkmasterhost:7077" and setting the Spark interpreter's master variable to "spark://sparkmasterhost:7077".
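For reference, a minimal sketch of the first setting. It assumes the export line lives in conf/zeppelin-env.sh under the Zeppelin install directory (the question does not say where it was added), and that Zeppelin is restarted afterwards so the interpreter picks it up. The host name is the placeholder from the question.

```shell
# Assumed location: conf/zeppelin-env.sh (not stated in the question).
# Points the Spark interpreter at the standalone master instead of local mode.
export MASTER=spark://sparkmasterhost:7077

# Print the value so it can be checked before restarting Zeppelin.
echo "MASTER is set to: $MASTER"
```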

When I run "sc.version", I get this error:

org.apache.thrift.transport.TTransportException
 at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
 at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86) 
 at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429) 
 at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318) 
 at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219) 
 at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69) 
 at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.recv_interpret(RemoteInterpreterService.java:220) 
 at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.interpret(RemoteInterpreterService.java:205) 
 at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.interpret(RemoteInterpreter.java:208) 
 at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:93) 
 at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:211) 
 at org.apache.zeppelin.scheduler.Job.run(Job.java:169) 
 at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:322) 
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
 at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
 at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) 
 at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) 
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
 at java.lang.Thread.run(Thread.java:745) 

I am not sure what is going on.

Upvotes: 3

Views: 6780

Answers (1)

avloss

Reputation: 2626

Very likely the version of Spark embedded in your Zeppelin differs from the version running on your Spark cluster. Open http://<spark-master.url>:8080/ and check the version in the top left corner (1.6.0, for instance). Then download the Zeppelin source and build it locally with the matching Spark version flags:

zeppelin@<remote-host>:~/incubator-zeppelin$ mvn clean package -DskipTests -Pspark-1.6 -Dspark.version=1.6.0

I've just cloned the repository from https://github.com/apache/incubator-zeppelin. Another very similar question here
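As a rough sketch of how the build flags follow from the cluster version: the profile uses the major.minor part (spark-1.6) and spark.version takes the full version (1.6.0). The helper names spark_profile and zeppelin_build_cmd are hypothetical, and the script only prints the command rather than running it. Whether a matching profile exists for versions other than 1.6 is an assumption to check against the Zeppelin build docs.

```shell
#!/bin/sh
# Hypothetical helper: derive the Maven profile name from a Spark version,
# keeping only major.minor (e.g. 1.6.0 -> spark-1.6).
spark_profile() {
  version="$1"
  major_minor=$(printf '%s' "$version" | cut -d. -f1,2)
  printf 'spark-%s\n' "$major_minor"
}

# Hypothetical helper: print (not run) the Zeppelin build command
# for the given Spark version.
zeppelin_build_cmd() {
  version="$1"
  printf 'mvn clean package -DskipTests -P%s -Dspark.version=%s\n' \
    "$(spark_profile "$version")" "$version"
}

# Version as read from the Spark master web UI; 1.6.0 is the example value.
SPARK_VERSION="1.6.0"
zeppelin_build_cmd "$SPARK_VERSION"
```

Run the printed command from the root of the cloned incubator-zeppelin directory.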

Upvotes: 2
