Documentation for distributed deployment and schedulers (#747)
* added initial scheduler documentation

* Documentation for Deployment Overview and configuration. Also modified Aurora Cluster, Mesos Cluster, Local Cluster and Slurm Cluster

* incorporated feedback on English corrections

* empty new line at the end

* incorporated feedback on English

* incorporated additional feedback

* removed setup from toc

* addressed feedback

* change mdash to use ---

* reworded text and removed the link
kramasamy committed May 25, 2016
1 parent 0a27d17 commit be87b09
Showing 19 changed files with 414 additions and 310 deletions.
@@ -1,5 +1,6 @@
# Directory of Config files for hadoop client to read from
# Directory of config files for local hadoop client to read from
heron.uploader.hdfs.config.directory: "/home/hadoop/hadoop/conf/"

# The URI of the directory for uploading topologies in the hdfs uploader
heron.uploader.hdfs.topologies.directory.uri: "hdfs:///heron/topology/"
heron.uploader.hdfs.topologies.directory.uri: "hdfs:///heron/topology/"

14 changes: 7 additions & 7 deletions website/content/docs/concepts/architecture.md
@@ -37,26 +37,26 @@ Stream Processing at Scale](http://dl.acm.org/citation.cfm?id=2742788) paper.

## Heron Design Goals

* **Isolation** — [Topologies](../topologies) should be process based
* **Isolation** --- [Topologies](../topologies) should be process based
rather than thread based, and each process should run in isolation for the
sake of easy debugging, profiling, and troubleshooting.
* **Resource constraints** — Topologies should use only those resources
* **Resource constraints** --- Topologies should use only those resources
that they are initially allocated and never exceed those bounds. This makes
Heron safe to run in shared infrastructure.
* **Compatibility** — Heron is fully API and data model compatible with
* **Compatibility** --- Heron is fully API and data model compatible with
[Apache Storm](http://storm.apache.org), making it easy for developers
to transition between systems.
* **Back pressure** — In a distributed system like Heron, there are no
* **Back pressure** --- In a distributed system like Heron, there are no
guarantees that all system components will execute at the same speed. Heron
has built-in [back pressure mechanisms]({{< ref "#stream-manager" >}}) to ensure that
topologies can self-adjust in case components lag.
* **Performance** &mdash; Many of Heron's design choices have enabled Heron to
* **Performance** --- Many of Heron's design choices have enabled Heron to
achieve higher throughput and lower latency than Storm while also offering
enhanced configurability to fine-tune potential latency/throughput trade-offs.
* **Semantic guarantees** &mdash; Heron provides support for both
* **Semantic guarantees** --- Heron provides support for both
[at-most-once and at-least-once](https://kafka.apache.org/08/design.html#semantics)
processing semantics.
* **Efficiency** &mdash; Heron was built with the goal of achieving all of the
* **Efficiency** --- Heron was built with the goal of achieving all of the
above with the minimal possible resource usage.

## Topology Components
6 changes: 3 additions & 3 deletions website/content/docs/contributors/codebase.md
@@ -31,16 +31,16 @@ Tracker](../../operators/heron-tracker).

## Main Tools

* **Build tool** &mdash; Heron uses [Bazel](http://bazel.io/) as its build tool.
* **Build tool** --- Heron uses [Bazel](http://bazel.io/) as its build tool.
Information on setting up and using Bazel for Heron can be found in [Compiling
Heron](../../developers/compiling/compiling).

* **Inter-component communication** &mdash; Heron uses [Protocol
* **Inter-component communication** --- Heron uses [Protocol
Buffers](https://developers.google.com/protocol-buffers/?hl=en) for
communication between components. Most `.proto` definition files can be found in
[`heron/proto`]({{% githubMaster %}}/heron/proto).

* **Cluster coordination** &mdash; Heron relies heavily on ZooKeeper for cluster
* **Cluster coordination** --- Heron relies heavily on ZooKeeper for cluster
coordination for distributed deployment, be it for [Mesos/Aurora](../../operators/deployment/schedulers/aurora),
[Mesos alone](../../operators/deployment/schedulers/mesos), or for a [custom
scheduler](../custom-scheduler) that you build. More information on ZooKeeper
20 changes: 10 additions & 10 deletions website/content/docs/contributors/custom-metrics-sink.md
@@ -20,15 +20,15 @@ for a specific topology. The code for these sinks may prove helpful for
implementing your own.

* [`GraphiteSink`](/api/metrics/com/twitter/heron/metricsmgr/sink/GraphiteSink.html)
&mdash; Sends each `MetricsRecord` object to a
--- Sends each `MetricsRecord` object to a
[Graphite](http://graphite.wikidot.com/) instance according to a Graphite
prefix.
* [`ScribeSink`](/api/metrics/com/twitter/heron/metricsmgr/sink/ScribeSink.html)
&mdash; Sends each `MetricsRecord` object to a
--- Sends each `MetricsRecord` object to a
[Scribe](https://github.com/facebookarchive/scribe) instance according to a
Scribe category and namespace.
* [`FileSink`](/api/metrics/com/twitter/heron/metricsmgr/sink/FileSink.html)
&mdash; Writes each `MetricsRecord` object to a JSON file at a specified path.
--- Writes each `MetricsRecord` object to a JSON file at a specified path.

More on using those sinks in a Heron cluster can be found in [Metrics
Manager](../../operators/configuration/metrics-manager).
@@ -62,20 +62,20 @@ Each metrics sink must implement the
[`IMetricsSink`](http://heronproject.github.io/metrics-api/com/twitter/heron/metricsmgr/IMetricsSink)
interface, which requires you to implement the following methods:

* `void init(Map<String, Object> conf, SinkContext context)` &mdash; Defines the
* `void init(Map<String, Object> conf, SinkContext context)` --- Defines the
initialization behavior of the sink. The `conf` map is the configuration that
is passed to the sink by the `.yaml` configuration file at
`heron/config/metrics_sink.yaml`; the
[`SinkContext`](/api/com/twitter/heron/spi/metricsmgr/sink/SinkContext.html)
object enables you to access values from the sink's runtime context
(the ID of the metrics manager, the ID of the sink, and the name of the
topology).
* `void processRecord(MetricsRecord record)` &mdash; Defines how each
* `void processRecord(MetricsRecord record)` --- Defines how each
`MetricsRecord` that passes through the sink is processed.
* `void flush()` &mdash; Flush any buffered metrics; this function is called at
* `void flush()` --- Flush any buffered metrics; this function is called at
the interval specified by the `flush-frequency-ms`. More info can be found in
the [Stream Manager](../../operators/configuration/stmgr) document.
* `void close()` &mdash; Closes the stream and releases any system resources
* `void close()` --- Closes the stream and releases any system resources
associated with it; if the stream is already closed, invoking `close()` has no
effect.

@@ -144,11 +144,11 @@ sinks:

For each sink, you need to specify the following (see the sketch after this list):

* `class` &mdash; The Java class name of your custom implementation of the
* `class` --- The Java class name of your custom implementation of the
`IMetricsSink` interface, e.g. `biz.acme.heron.metrics.PrintSink`.
* `flush-frequency-ms` &mdash; The frequency (in milliseconds) at which the
* `flush-frequency-ms` --- The frequency (in milliseconds) at which the
`flush()` method is called in your implementation of `IMetricsSink`.
* `sink-restart-attempts` &mdash; The number of times that a sink will attempt to
* `sink-restart-attempts` --- The number of times that a sink will attempt to
restart if it throws exceptions and dies. If you do not set this, the default
is 0; if you set it to -1, the sink will attempt to restart forever.
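
As a minimal sketch, an entry for the hypothetical `PrintSink` above might look like the
following; the sink name `print-sink`, the exact layout, and the values shown are
illustrative rather than authoritative:

```yaml
# Illustrative sketch only; adapt the sample metrics_sinks.yaml shipped with Heron.
sinks:
  - print-sink                                # sinks the Metrics Manager should load

print-sink:
  class: "biz.acme.heron.metrics.PrintSink"   # custom IMetricsSink implementation
  flush-frequency-ms: 60000                   # flush() is invoked once per minute
  sink-restart-attempts: -1                   # restart the sink indefinitely if it dies
```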

4 changes: 2 additions & 2 deletions website/content/docs/operators/configuration/config-intro.md
@@ -4,9 +4,9 @@ title: Intro to Heron Configuration

Heron can be configured at two levels:

1. **The system level** &mdash; System-level configurations apply to the whole
1. **The system level** --- System-level configurations apply to the whole
Heron cluster rather than to any specific topology.
2. **The topology level** &mdash; Topology configurations apply only to a
2. **The topology level** --- Topology configurations apply only to a
specific topology and can be modified at any stage of the topology's
[lifecycle](../../../concepts/topologies#topology-lifecycle).

93 changes: 93 additions & 0 deletions website/content/docs/operators/deployment/configuration.md
@@ -0,0 +1,93 @@
# Configuring a Cluster

To set up a Heron cluster, you need to configure a few files. Each file configures
a component of the Heron streaming framework.

* **scheduler.yaml** --- This file specifies the classes required for the launcher, the
scheduler, and for managing the topology at runtime. Any other scheduler-specific
parameters also go in this file.

* **statemgr.yaml** --- This file contains the classes and the configuration for the state manager.
The state manager maintains the running state of the topology, including its logical plan,
physical plan, scheduler state, and execution state.

* **uploader.yaml** --- This file specifies the classes and configuration for the uploader,
which uploads the topology jars to storage. Once the containers are scheduled, they
download these jars from storage in order to run the topology.

* **heron_internals.yaml** --- This file contains parameters that control
how Heron behaves. Tuning these parameters requires advanced knowledge of Heron's
architecture and its components. To start, the best option is simply to copy the provided
sample configuration file. Once you are familiar with the system, you can tune these
parameters to achieve high-throughput or low-latency topologies.

* **metrics_sinks.yaml** --- This file specifies where the run-time system and topology metrics
are routed. By default, the `file sink` and `tmaster sink` must be present. In addition,
the `scribe sink` and `graphite sink` are also supported; an illustrative example appears
at the end of this page.

* **packing.yaml** --- This file specifies the class for the `packing algorithm`, which defaults
to Round Robin if not specified.

* **client.yaml** --- This file controls the behavior of the `heron` client. It is optional.

# Assembling the Configuration

All configuration files are assembled together to form the cluster configuration. For example,
a cluster named `devcluster` that uses Aurora as the scheduler, ZooKeeper as the state manager,
and HDFS as the uploader would have the following set of configurations.

## scheduler.yaml (for Aurora)

```yaml
# scheduler class for distributing the topology for execution
heron.class.scheduler: com.twitter.heron.scheduler.aurora.AuroraScheduler

# launcher class for submitting and launching the topology
heron.class.launcher: com.twitter.heron.scheduler.aurora.AuroraLauncher

# location of java
heron.directory.sandbox.java.home: /usr/lib/jvm/java-1.8.0-openjdk-amd64/

# Invoke the IScheduler as a library directly
heron.scheduler.is.service: False
```
## statemgr.yaml (for ZooKeeper)
```yaml
# zookeeper state manager class for managing state in a persistent fashion
heron.class.state.manager: com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager

# zookeeper state manager connection string
heron.statemgr.connection.string: "127.0.0.1:2181"

# path of the root address to store the state in zookeeper
heron.statemgr.root.path: "/heron"

# create the zookeeper nodes, if they do not exist
heron.statemgr.zookeeper.is.initialize.tree: True
```
## uploader.yaml (for HDFS)
```yaml
# Directory of config files for hadoop client to read from
heron.uploader.hdfs.config.directory: "/home/hadoop/hadoop/conf/"

# The URI of the directory for uploading topologies in the hdfs
heron.uploader.hdfs.topologies.directory.uri: "hdfs:///heron/topology/"
```
## packing.yaml (for Round Robin)
```yaml
# packing algorithm for packing instances into containers
heron.class.packing.algorithm: com.twitter.heron.packing.roundrobin.RoundRobinPacking
```
## client.yaml (for heron cli)
```yaml
# should the role parameter be required
heron.config.role.required: false

# should the environ parameter be required
heron.config.env.required: false
```
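## metrics_sinks.yaml (for the file sink)

A minimal sketch is shown below; the sink name, the exact layout, and the values are
illustrative, and the sample `metrics_sinks.yaml` shipped with Heron is the recommended
starting point. For brevity only the file sink is shown, although by default the tmaster
sink must be configured as well.

```yaml
# sinks the Metrics Manager should load (illustrative subset)
sinks:
  - file-sink

# write each MetricsRecord to a JSON file
file-sink:
  class: "com.twitter.heron.metricsmgr.sink.FileSink"
  flush-frequency-ms: 60000          # how often flush() is called, in milliseconds
  sink-restart-attempts: 0           # do not restart the sink if it dies (the default)
```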
55 changes: 48 additions & 7 deletions website/content/docs/operators/deployment/index.md
@@ -2,12 +2,53 @@
title: Deploying Heron
---

Heron is designed to be run in clustered, scheduler-driven environments. It
currently supports three scheduler options out of the box:
Heron is designed to be run in clustered, scheduler-driven environments. It can
be run in `multi-tenant` or `dedicated` clusters. Furthermore, Heron supports
`multiple clusters`, and a user can submit topologies to any of these clusters. Each
cluster can use a `different scheduler`. A typical Heron deployment is shown
in the following figure.

* [Aurora](schedulers/aurora)
* [Mesos](schedulers/mesos)
* [Local scheduler](schedulers/local)
<br />
![Heron Deployment](/img/heron-deployment.png)
<br/>

To implement a new scheduler, see
[Implementing a Custom Scheduler](../../contributors/custom-scheduler).
A Heron deployment requires several components working together. The following must
be deployed to run Heron topologies in a cluster:

* **Scheduler** --- Heron requires a scheduler to run its topologies. It can
be deployed on an existing cluster running alongside other big data frameworks.
Alternatively, it can be deployed on a cluster of its own. Heron currently
supports several scheduler options:
* [Aurora](schedulers/aurora)
* [Local](schedulers/local)
* [Mesos](schedulers/mesos)
* [Slurm](schedulers/slurm)

* **State Manager** --- The Heron state manager tracks the state of all deployed
topologies. The topology state includes its logical plan,
physical plan, and execution state. Heron supports the following state managers:
* [Local File System](statemanagers/localfs)
* [ZooKeeper](statemanagers/zookeeper)

* **Uploader** --- The Heron uploader distributes the topology jars to the
servers that run them. Heron supports several uploaders:
* [HDFS](uploaders/hdfs)
* [Local File System](uploaders/localfs)
* [Amazon S3](uploaders/s3)

* **Metrics Sinks** --- Heron collects several metrics during topology execution.
These metrics can be routed to a sink for storage and offline analysis.
Currently, Heron supports the following sinks:

* `File Sink`
* `Graphite Sink`
* `Scribe Sink`

* **Heron Tracker** --- The Tracker serves as the gateway for exploring topologies.
It exposes a REST API for exploring the logical plan and physical plan of topologies,
and for fetching their metrics.

* **Heron UI** --- The UI provides the ability to find and explore topologies visually.
It displays the DAG of the topology and how the DAG is mapped to the physical containers
running in the cluster. Furthermore, it lets you view logs, take heap dumps and memory
histograms, view metrics, and more.