Reputation: 176
I was recently trying to set up the Spark Notebook in the Hue UI. I am running Cloudera CDH 5.8 in VirtualBox. The Spark notebook runs on top of the Livy server, so I installed Livy. I also removed spark from the blacklist in the hue.ini file. Still, the Spark Notebook does not show up in the Hue UI.
Update: I can now access the notebook. However, I cannot submit Spark jobs to the cluster. I have tried several scripts: only the Impala and Hive scripts work, while the R, PySpark, and Scala scripts do not. I get the following errors.
Can somebody help me figure out the problem? I can provide more information if needed.
Thank you.
Thanks to Romainr, I managed to get the Spark Notebook running in Hue. Now I am having trouble submitting jobs to Apache Spark, which runs under Cloudera Manager on the same localhost. The errors are shown below. Any help will be much appreciated. Thank you.
Error: Spark session could not be created in cluster: timeout
"Session '-1' not found." (error 404)
Upvotes: 0
Views: 3247
Reputation: 176
If you run a PySpark notebook from Hue, it reports a timeout because it cannot access the resources it needs. In fact, if you run the pyspark or scala command from the command line, you will see errors there too.
When you get the timeout error from the Hue Notebook, look in the log and you will find permission denied errors. To grant access, run the following in a Linux shell:
$ sudo -u hdfs hadoop fs -chmod 777 /user/spark
$ sudo -u spark hadoop fs -chmod 777 /user/spark/applicationHistory
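To confirm that the permission change took effect, something like the following sketch could be used. It assumes that `hadoop fs -ls -d` prints the permission string (for example `drwxrwxrwx`) as the first column of its output; the function names `is_world_accessible` and `check_hdfs_paths` are hypothetical helpers written for this illustration, not part of Hadoop or Hue.

```shell
# Hypothetical helper: succeeds when a permission string such as
# "drwxrwxrwx" grants read, write and execute to "other" users.
# A trailing "t" (sticky bit, as in "drwxrwxrwt") is accepted as well.
is_world_accessible() {
  case "$1" in
    # 1 type character + 3 owner bits + 3 group bits, then the "other" bits
    ???????rw[xt]*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: prints one status line per HDFS path given.
check_hdfs_paths() {
  for chk_path in "$@"; do
    # -d lists the directory entry itself rather than its contents.
    # Assumption: the permission string is the first column of the last line.
    chk_perms="$(sudo -u hdfs hadoop fs -ls -d "$chk_path" | tail -n 1 | awk '{print $1}')"
    if is_world_accessible "$chk_perms"; then
      echo "OK      $chk_perms $chk_path"
    else
      echo "BLOCKED $chk_perms $chk_path"
    fi
  done
}

# Usage on the cluster node (not executed automatically):
#   check_hdfs_paths /user/spark /user/spark/applicationHistory
```

A path reported as BLOCKED (or with an empty permission string, which would mean the listing itself failed) would suggest the chmod did not apply to that directory.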
After this, restart the Hue and Spark services in CDH and create a PySpark or Scala notebook from Hue; it should run out of the box. If you still get errors, let me know.
Upvotes: 0