[AutoOps] Reference AutoOps solution on troubleshooting pages (elastic#119630)

* Reference AutoOps on troubleshooting pages

* Integrate reviewer's feedback

(cherry picked from commit 70e5a67)

# Conflicts:
#	docs/reference/troubleshooting/common-issues/circuit-breaker-errors.asciidoc
#	docs/reference/troubleshooting/common-issues/hotspotting.asciidoc
alaudazzi committed Jan 15, 2025
1 parent 10100a9 commit 6b8179f
Showing 19 changed files with 78 additions and 0 deletions.
4 changes: 4 additions & 0 deletions docs/reference/monitoring/overview.asciidoc
@@ -13,6 +13,10 @@ All of the monitoring metrics are stored in {es}, which enables you to easily
visualize the data in {kib}. By default, the monitoring metrics are stored in
local indices.

****
If you're using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to https://www.elastic.co/guide/en/cloud/current/ec-autoops.html[Monitor with AutoOps].
****

TIP: In production, we strongly recommend using a separate monitoring cluster.
Using a separate monitoring cluster prevents production cluster outages from
impacting your ability to access your monitoring data. It also prevents
4 changes: 4 additions & 0 deletions docs/reference/troubleshooting.asciidoc
@@ -6,6 +6,10 @@
This section provides a series of troubleshooting solutions aimed at helping users
fix problems that an {es} deployment might encounter.

****
If you're using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to https://www.elastic.co/guide/en/cloud/current/ec-autoops.html[Monitor with AutoOps].
****

[discrete]
[[troubleshooting-general]]
=== General
@@ -9,6 +9,13 @@ By default, the <<parent-circuit-breaker,parent circuit breaker>> triggers at
95% JVM memory usage. To prevent errors, we recommend taking steps to reduce
memory pressure if usage consistently exceeds 85%.

See https://www.youtube.com/watch?v=k3wYlRVbMSw[this video] for a walkthrough
of diagnosing circuit breaker errors.

****
If you're using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to https://www.elastic.co/guide/en/cloud/current/ec-autoops.html[Monitor with AutoOps].
****

[discrete]
[[diagnose-circuit-breaker-errors]]
==== Diagnose circuit breaker errors
@@ -11,3 +11,7 @@ include::{es-ref-dir}/tab-widgets/troubleshooting/data/diagnose-unassigned-shard
See https://www.youtube.com/watch?v=v2mbeSd1vTQ[this video]
for a walkthrough of monitoring allocation health.

****
If you're using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to https://www.elastic.co/guide/en/cloud/current/ec-autoops.html[Monitor with AutoOps].
****

@@ -21,6 +21,10 @@ usage falls below the <<cluster-routing-watermark-high,high disk watermark>>.
To achieve this, {es} attempts to rebalance some of the affected node's shards
to other nodes in the same data tier.

****
If you're using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to https://www.elastic.co/guide/en/cloud/current/ec-autoops.html[Monitor with AutoOps].
****

[[fix-watermark-errors-rebalance]]
==== Monitor rebalancing

@@ -11,6 +11,10 @@ depleted, {es} will reject search requests until more threads are available.

You might experience high CPU usage if a <<data-tiers,data tier>>, and therefore the nodes assigned to that tier, is experiencing more traffic than other tiers. This imbalance in resource utilization is also known as <<hotspotting,hot spotting>>.

****
If you're using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to https://www.elastic.co/guide/en/cloud/current/ec-autoops.html[Monitor with AutoOps].
****

[discrete]
[[diagnose-high-cpu-usage]]
==== Diagnose high CPU usage
@@ -6,6 +6,10 @@ High JVM memory usage can degrade cluster performance and trigger
taking steps to reduce memory pressure if a node's JVM memory usage consistently
exceeds 85%.

****
If you're using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to https://www.elastic.co/guide/en/cloud/current/ec-autoops.html[Monitor with AutoOps].
****

[discrete]
[[diagnose-high-jvm-memory-pressure]]
==== Diagnose high JVM memory pressure
@@ -11,6 +11,12 @@ may occur in {es} when resource utilizations are unevenly distributed across
ongoing significantly unique utilization may lead to cluster bottlenecks
and should be reviewed.

****
If you're using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to https://www.elastic.co/guide/en/cloud/current/ec-autoops.html[Monitor with AutoOps].
****

See link:https://www.youtube.com/watch?v=Q5ODJ5nIKAM[this video] for a walkthrough of troubleshooting a hot spotting issue.

[discrete]
[[detect]]
==== Detect hot spotting
@@ -22,6 +22,10 @@ the remaining problems so management and cleanup activities can proceed.
See https://www.youtube.com/watch?v=v2mbeSd1vTQ[this video]
for a walkthrough of monitoring allocation health.

****
If you're using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to https://www.elastic.co/guide/en/cloud/current/ec-autoops.html[Monitor with AutoOps].
****

[discrete]
[[diagnose-cluster-status]]
==== Diagnose your cluster status
@@ -12,6 +12,10 @@ thread pool returns a `TOO_MANY_REQUESTS` error message.
* High <<index-modules-indexing-pressure,indexing pressure>> that exceeds the
<<memory-limits,`indexing_pressure.memory.limit`>>.

****
If you're using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to https://www.elastic.co/guide/en/cloud/current/ec-autoops.html[Monitor with AutoOps].
****

[discrete]
[[check-rejected-tasks]]
==== Check rejected tasks
@@ -16,5 +16,8 @@ In order to fix this follow the next steps:

include::{es-ref-dir}/tab-widgets/troubleshooting/data/increase-cluster-shard-limit-widget.asciidoc[]

****
If you're using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to https://www.elastic.co/guide/en/cloud/current/ec-autoops.html[Monitor with AutoOps].
****


@@ -14,5 +14,9 @@ In order to fix this follow the next steps:

include::{es-ref-dir}/tab-widgets/troubleshooting/data/total-shards-per-node-widget.asciidoc[]

****
If you're using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to https://www.elastic.co/guide/en/cloud/current/ec-autoops.html[Monitor with AutoOps].
****



@@ -17,5 +17,8 @@ In order to fix this follow the next steps:

include::{es-ref-dir}/tab-widgets/troubleshooting/data/increase-tier-capacity-widget.asciidoc[]

****
If you're using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to https://www.elastic.co/guide/en/cloud/current/ec-autoops.html[Monitor with AutoOps].
****


4 changes: 4 additions & 0 deletions docs/reference/troubleshooting/diagnostic.asciidoc
@@ -15,6 +15,10 @@ https://discuss.elastic.co[Elastic Discuss] to minimize turnaround time.

See https://www.youtube.com/watch?v=Bb6SaqhqYHw[this video] for a walkthrough of capturing an {es} diagnostic.

****
If you're using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to https://www.elastic.co/guide/en/cloud/current/ec-autoops.html[Monitor with AutoOps].
****

[discrete]
[[diagnostic-tool-requirements]]
=== Requirements
@@ -3,6 +3,10 @@

This guide describes how to fix common errors and problems with {es} clusters.

****
If you're using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to https://www.elastic.co/guide/en/cloud/current/ec-autoops.html[Monitor with AutoOps].
****

<<fix-watermark-errors,Watermark errors>>::
Fix watermark errors that occur when a data node is critically low on disk space
and has reached the flood-stage disk usage watermark.
@@ -14,5 +14,8 @@ information about the problem:

include::{es-ref-dir}/tab-widgets/troubleshooting/snapshot/repeated-snapshot-failures-widget.asciidoc[]

****
If you're using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to https://www.elastic.co/guide/en/cloud/current/ec-autoops.html[Monitor with AutoOps].
****


@@ -8,3 +8,7 @@ The current shards capacity of the cluster is available in the
<<health-api-response-details-shards-capacity, health API shards capacity section>>.

include::{es-ref-dir}/tab-widgets/troubleshooting/troubleshooting-shards-capacity-widget.asciidoc[]

****
If you're using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to https://www.elastic.co/guide/en/cloud/current/ec-autoops.html[Monitor with AutoOps].
****
@@ -7,6 +7,10 @@ Elasticsearch balances shards across data tiers to achieve a good compromise bet
* disk usage
* write load (for indices in data streams)

****
If you're using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to https://www.elastic.co/guide/en/cloud/current/ec-autoops.html[Monitor with AutoOps].
****

Elasticsearch does not take into account the amount or complexity of search queries when rebalancing shards.
This is indirectly achieved by balancing shard count and disk usage.

@@ -17,6 +17,10 @@ logs.

* The master may appear busy due to frequent cluster state updates.

****
If you're using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to https://www.elastic.co/guide/en/cloud/current/ec-autoops.html[Monitor with AutoOps].
****

To troubleshoot a cluster in this state, first ensure the cluster has a
<<discovery-troubleshooting,stable master>>. Next, focus on the nodes
unexpectedly leaving the cluster ahead of all other issues. It will not be