Apache Zeppelin

Apache Zeppelin is a web-based notebook platform that enables interactive data analytics with interactive data visualizations and notebook sharing. You can make data-driven, interactive and collaborative documents with SQL, Scala, Python, R in a single notebook document.

It shares a single SparkContext between languages (so you can pass data between Scala and Python easily).

This is an excellent tool for prototyping Scala/Spark code with SQL queries to analyze data (by data visualizations) that could be used by non-Scala developers like data analysts using SQL and Python.

Note
Zeppelin aims at more analytics and business people (not necessarily for Spark/Scala developers for whom Spark Notebook may appear a better fit).

Clients talk to the Zeppelin Server using HTTP REST or Websocket endpoints.

Available basic and advanced display systems:

  • text (default)

  • HTML

  • table

  • Angular

Features

  1. Apache License 2.0 licensed

  2. Interactive

  3. Web-Based

  4. Data Visualization (charts)

  5. Collaboration by Sharing Notebooks and Paragraphs

  6. Multiple Language and Data Processing Backends called Interpreters, including the built-in Apache Spark integration, Apache Flink, Apache Hive, Apache Cassandra, Apache Tajo, Apache Phoenix, Apache Ignite, Apache Geode

  7. Display Systems

  8. Built-in Scheduler to run notebooks with cron expression

Further reading or watching

  1. (video) Data Science Lifecycle with Zeppelin and Spark from Spark Summit Europe 2015 with the creator of the Apache Zeppelin project — Moon soo Lee.

results matching ""

    No results matching ""