Reputation: 233
I am evaluating Spark notebooks and found three different products: 1. Hue 3.9, which comes with a Spark notebook (beta); 2. Apache Zeppelin; 3. andypetrella/spark-notebook.
Can you please help me understand the pros and cons of each product?
Thanks, Pani
Upvotes: 1
Views: 4918
Reputation: 1304
Jupyter is a well-established project, whereas Spark Notebook is a great but individual effort (with a good, fairly recent explanation here from the author himself), and Zeppelin is incubating at Apache. On that consideration we have the modern version of "no one ever got fired for buying IBM" (until they did, haha), and Jupyter is the IBM in the room.
It may help to look over some of the docs on Cloudera, for example http://blog.cloudera.com/blog/2014/08/how-to-use-ipython-notebook-with-apache-spark/ (note that Jupyter used to be called IPython Notebook).
If you could post more about your use case, and perhaps what research you have already done, it would help people answer your question. Stack Overflow has specific requirements for good questions, and a big emphasis is on trying something first and posting code. Your question may be a better fit for another Stack Exchange site.
If you look here, you'll find more interesting information. For example, Zeppelin is more focused on running on top of Hadoop (and Tachyon, which I guess is a transparent layer), and it provides a pluggable interface so you can develop in more languages.
Upvotes: 0
Reputation: 48
I have only played with Hue and Jupyter.
Hue is kind of new but offers more than just a Spark notebook: it integrates with all the Hadoop components (Oozie, Solr, Impala, HBase, Pig...).
Jupyter is great if you want an advanced editor for PySpark. The Python editor is really good, and it is very popular in the Python community.
Upvotes: 2