What is Apache Zeppelin?

Farooque · Jun 8, 2016 · Viewed 10.4k times

We keep hearing about Apache Zeppelin, so a few questions come to mind:

  1. What is Apache Zeppelin?
  2. What new or extra capability does it add to the big data ecosystem?
  3. Is it a replacement for any of the frameworks or tools that already exist in the big data ecosystem?

Answer

Ram Ghadiyaram · Jun 8, 2016

Short answer: a web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.

Long answer:

  1. The Zeppelin notebook gives you an easy, straightforward way to execute arbitrary code in a web notebook. You can run Scala and SQL, and even schedule a job (via cron) to run at a regular interval.

  2. First, it is easier to mix languages in the same notebook. You can write some SQL and Scala, then add Markdown to document it all together. You can also easily convert your notebook into a presentation style, for example to present to management or to use in dashboards.

  3. It is comparable to the Jupyter (formerly known as IPython) Notebook, which has been extremely popular in the Python community. I would not use the word "replace"; rather, I would call it a similar kind of...
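
To make the "mix languages in the same notebook" point concrete, here is a minimal sketch in plain Python. It is a toy model and not Zeppelin's implementation: the `NOTE` text, `split_paragraphs` and `run_note` are hypothetical names made up for illustration, and an in-memory SQLite database stands in for a real data source. The only idea borrowed from Zeppelin is that each paragraph begins with a `%name` prefix that picks the interpreter for that paragraph.

```python
import io
import sqlite3
from contextlib import redirect_stdout

# Hypothetical toy note: each paragraph starts with a %name directive
# that selects which "interpreter" handles the lines that follow.
NOTE = """\
%md
## Daily sales
%sql
CREATE TABLE sales (day TEXT, amount INTEGER);
INSERT INTO sales VALUES ('mon', 10), ('tue', 32);
%python
total = sum(row[0] for row in db.execute("SELECT amount FROM sales"))
print("total:", total)
"""


def split_paragraphs(note):
    """Split note text into (interpreter, body) pairs at each %directive line."""
    paragraphs = []
    for line in note.splitlines():
        if line.startswith("%"):
            paragraphs.append((line[1:].strip(), []))
        elif paragraphs:
            paragraphs[-1][1].append(line)
    return [(name, "\n".join(body)) for name, body in paragraphs]


def run_note(note):
    """Run every paragraph in order and return a list of (interpreter, output)."""
    db = sqlite3.connect(":memory:")  # assumption: stands in for a real data source
    results = []
    for name, body in split_paragraphs(note):
        if name == "md":
            output = body  # markdown is documentation only; keep it as text
        elif name == "sql":
            db.executescript(body)  # run the SQL statements, no output captured
            output = ""
        elif name == "python":
            buffer = io.StringIO()
            with redirect_stdout(buffer):
                exec(body, {"db": db})  # the paragraph sees the shared connection
            output = buffer.getvalue().strip()
        else:
            raise ValueError(f"no interpreter for %{name}")
        results.append((name, output))
    db.close()
    return results


if __name__ == "__main__":
    for name, output in run_note(NOTE):
        print(f"[{name}] {output}")
```

The design point the sketch tries to capture is that documentation, SQL and general-purpose code sit side by side in one document and share state (here, the database connection), which is what makes a notebook convenient for exploratory analysis.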

Furthermore:

  • Zeppelin supports Spark, PySpark, SparkR and Spark SQL, with a dependency loader.

  • Zeppelin lets you connect to any JDBC data source seamlessly: PostgreSQL, MySQL, MariaDB, Redshift, Apache Hive and so on.

  • Python is supported with Matplotlib, Conda, Pandas SQL and PySpark integrations.