Add monitoring diagram + solve todos #5207

Merged 1 commit on Oct 30, 2024
114 changes: 114 additions & 0 deletions docs/src/main/draw.io/delta/prometheus_monitoring.drawio

Large diffs are not rendered by default.

@@ -26,13 +26,14 @@ Nexus Delta relies on @link:[Kamon](https://kamon.io/) to also collect metrics a

To enable Kamon in Delta, the `KAMON_ENABLED` env variable must be set to true.

-// TODO add mentions to the Kibana dashboard
+To monitor Nexus write activity like resource and file creation/updates, a dashboard is available in the
+@link:[Nexus repo](https://github.com/BlueBrain/nexus/blob/$git.branch$/kibana/event-metrics/general.ndjson).

**Logs:**
Nexus Delta relies on @link:[Logback](https://logback.qos.ch/) for logs which one of the popular logging
frameworks on the JVM.

-Log back provides:
+Logback provides:

* Reloading the configuration while the application is running
* Control the output of the logs, opting for JSON helps for the integration with Filebeats and Elasticsearch
@@ -250,12 +250,12 @@ Monitoring a Nexus deployment is important to identify its performance and its h
While different approaches are possible depending on how and where Nexus is deployed, an example of monitoring stack can be:

* @link:[Prometheus](https://prometheus.io/) which is a open-source tool for monitoring
-* @link:[Blackbox](https://github.com/prometheus/blackbox_exporter) can be used to probe endpoints over
+* @link:[Blackbox](https://github.com/prometheus/blackbox_exporter) can be used to probe endpoints over http
* @link:[Alert Manager](https://github.com/prometheus/alertmanager) for alerts
* @link:[Grafana](https://grafana.com/grafana/) for visualization
* @link:[Filebeats](https://www.elastic.co/beats/filebeat), @link:[Elasticsearch](https://www.elastic.co/elasticsearch) and @link:[Kibana](https://www.elastic.co/kibana) to collect and visualize Delta logs

Prometheus has been the historical solution for monitoring Nexus at BBP for its popularity and its versatility as it allows to
monitor all components.

-// TODO add a diagram of how the Prometheus ecosystem work.
+![Monitoring with Prometheus](../assets/prometheus-monitoring.png)
@@ -11,6 +11,4 @@ operation are completed without forgetting to remove the environment variables.

## Fusion

-The data model for Studios has changed in 1.7.
-
-//TODO Add Studio migration instructions here
+The data model for Studios has changed in 1.7.
3 changes: 0 additions & 3 deletions docs/src/main/paradox/docs/releases/v1.8-release-notes.md
@@ -206,6 +206,3 @@ The goal of this release was to improve the user experience and functionality of
11. App Configuration:
- The Application manager has the ability to select his favourites images for different top-level pages and the video in the Login page.

-## Nexus forge
-
-TODO
2 changes: 0 additions & 2 deletions docs/src/main/paradox/docs/releases/v1.9-release-notes.md
@@ -6,8 +6,6 @@
> With v1.9, Nexus Delta also changes its underlying runtime, allowing to it to work properly,
> switching from @link:[Monix](https://monix.io/) which is not maintained anymore to @link:[Cats Effect](https://typelevel.org/cats-effect/).

-//TODO adjust the end date
-
For the detailed list of updates in this release, see the
@link:[list of addressed issues](https://github.com/BlueBrain/nexus/issues?&q=is%3Aissue+is%3Aclosed+created%3A2023-06-15..2023-12-14+){ open=new }
since v1.8.