Reputation: 351
I'm a new user of PySpark, accessing my Spark cluster from Apache Zeppelin 0.7.1. I configured two machines, with Machine-1 as the master:
Situation:
The cluster works fine if I use the pyspark console from the master (Machine-1).
When I use Spark's local[*] master configuration, everything works from Zeppelin.
Following the Zeppelin documentation, I put spark://Machine-1:7077 in the master property of the Spark interpreter configuration. Then some code runs fine in the cells of my Zeppelin notebook:
%spark
sc.version
sc.getConf.get("spark.home")
System.getenv().get("PYTHONPATH")
System.getenv().get("SPARK_HOME")
but other cells, for instance this RDD code, never finish:
%pyspark
input_file = "/tmp/kddcup.data_10_percent.gz"
raw_rdd = sc.textFile(input_file)
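One thing worth checking (a sketch of a common cause, not a diagnosis confirmed in this thread): with a plain filesystem path like the one above, sc.textFile expects the file to exist at that same path on every worker, not just on the driver machine. A quick existence check run on each machine can rule that out; the helper below is hypothetical, written for illustration:

```python
import os.path

def check_local_input(path):
    """Report whether a local input file is readable on this machine.

    With a plain filesystem path (no shared storage such as HDFS behind it),
    every Spark worker needs its own copy of the file at the same location,
    otherwise tasks reading that partition cannot proceed.
    """
    return os.path.isfile(path)

# Run this on Machine-1 and Machine-2 alike; the path is from the question.
print(check_local_input("/tmp/kddcup.data_10_percent.gz"))
```

If the file is missing on a worker, copying it there (or pointing sc.textFile at shared storage) is the usual fix.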
What's wrong? Any advice? Thank you in advance.
Upvotes: 0
Views: 804
Reputation: 351
Eventually I realised that:
Thank you, Greg, for your interest.
Upvotes: 0