[AutoOps] Reference AutoOps solution on troubleshooting pages (#119630) (#120182)

* Reference AutoOps on troubleshooting pages

* Integrate reviewer's feedback

(cherry picked from commit 70e5a67)

# Conflicts:
#	docs/reference/troubleshooting/common-issues/hotspotting.asciidoc

Co-authored-by: Arianna Laudazzi <[email protected]>
leemthompo and alaudazzi authored Jan 15, 2025
1 parent 85f8a64 commit a97b542
Showing 19 changed files with 76 additions and 1 deletion.
4 changes: 4 additions & 0 deletions docs/reference/monitoring/overview.asciidoc
@@ -13,6 +13,10 @@ All of the monitoring metrics are stored in {es}, which enables you to easily
visualize the data in {kib}. By default, the monitoring metrics are stored in
local indices.

****
If you're using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to https://www.elastic.co/guide/en/cloud/current/ec-autoops.html[Monitor with AutoOps].
****

TIP: In production, we strongly recommend using a separate monitoring cluster.
Using a separate monitoring cluster prevents production cluster outages from
impacting your ability to access your monitoring data. It also prevents
4 changes: 4 additions & 0 deletions docs/reference/troubleshooting.asciidoc
@@ -6,6 +6,10 @@
This section provides a series of troubleshooting solutions aimed at helping users
fix problems that an {es} deployment might encounter.

****
If you're using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to https://www.elastic.co/guide/en/cloud/current/ec-autoops.html[Monitor with AutoOps].
****

[discrete]
[[troubleshooting-general]]
=== General

@@ -12,6 +12,10 @@ memory pressure if usage consistently exceeds 85%.
See https://www.youtube.com/watch?v=k3wYlRVbMSw[this video] for a walkthrough
of diagnosing circuit breaker errors.

****
If you're using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to https://www.elastic.co/guide/en/cloud/current/ec-autoops.html[Monitor with AutoOps].
****

[discrete]
[[diagnose-circuit-breaker-errors]]
==== Diagnose circuit breaker errors

@@ -11,3 +11,7 @@ include::{es-ref-dir}/tab-widgets/troubleshooting/data/diagnose-unassigned-shard
See https://www.youtube.com/watch?v=v2mbeSd1vTQ[this video]
for a walkthrough of monitoring allocation health.

****
If you're using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to https://www.elastic.co/guide/en/cloud/current/ec-autoops.html[Monitor with AutoOps].
****

@@ -21,6 +21,10 @@ usage falls below the <<cluster-routing-watermark-high,high disk watermark>>.
To achieve this, {es} attempts to rebalance some of the affected node's shards
to other nodes in the same data tier.

****
If you're using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to https://www.elastic.co/guide/en/cloud/current/ec-autoops.html[Monitor with AutoOps].
****

[[fix-watermark-errors-rebalance]]
==== Monitor rebalancing


@@ -11,6 +11,10 @@ depleted, {es} will reject search requests until more threads are available.

You might experience high CPU usage if a <<data-tiers,data tier>>, and therefore the nodes assigned to that tier, is experiencing more traffic than other tiers. This imbalance in resource utilization is also known as <<hotspotting,hot spotting>>.

****
If you're using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to https://www.elastic.co/guide/en/cloud/current/ec-autoops.html[Monitor with AutoOps].
****

[discrete]
[[diagnose-high-cpu-usage]]
==== Diagnose high CPU usage

@@ -6,6 +6,10 @@ High JVM memory usage can degrade cluster performance and trigger
taking steps to reduce memory pressure if a node's JVM memory usage consistently
exceeds 85%.

****
If you're using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to https://www.elastic.co/guide/en/cloud/current/ec-autoops.html[Monitor with AutoOps].
****

[discrete]
[[diagnose-high-jvm-memory-pressure]]
==== Diagnose high JVM memory pressure

@@ -9,7 +9,13 @@ Computer link:{wikipedia}/Hot_spot_(computer_programming)[hot spotting]
may occur in {es} when resource utilizations are unevenly distributed across
<<modules-node,nodes>>. Temporary spikes are not usually considered problematic, but
ongoing significantly unique utilization may lead to cluster bottlenecks
and should be reviewed.
and should be reviewed.

See link:https://www.youtube.com/watch?v=Q5ODJ5nIKAM[this video] for a walkthrough of troubleshooting a hot spotting issue.

****
If you're using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to https://www.elastic.co/guide/en/cloud/current/ec-autoops.html[Monitor with AutoOps].
****

[discrete]
[[detect]]

@@ -22,6 +22,10 @@ the remaining problems so management and cleanup activities can proceed.
See https://www.youtube.com/watch?v=v2mbeSd1vTQ[this video]
for a walkthrough of monitoring allocation health.

****
If you're using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to https://www.elastic.co/guide/en/cloud/current/ec-autoops.html[Monitor with AutoOps].
****

[discrete]
[[diagnose-cluster-status]]
==== Diagnose your cluster status

@@ -12,6 +12,10 @@ thread pool returns a `TOO_MANY_REQUESTS` error message.
* High <<index-modules-indexing-pressure,indexing pressure>> that exceeds the
<<memory-limits,`indexing_pressure.memory.limit`>>.

****
If you're using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to https://www.elastic.co/guide/en/cloud/current/ec-autoops.html[Monitor with AutoOps].
****

[discrete]
[[check-rejected-tasks]]
==== Check rejected tasks

@@ -16,5 +16,8 @@ In order to fix this follow the next steps:

include::{es-ref-dir}/tab-widgets/troubleshooting/data/increase-cluster-shard-limit-widget.asciidoc[]

****
If you're using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to https://www.elastic.co/guide/en/cloud/current/ec-autoops.html[Monitor with AutoOps].
****


@@ -14,5 +14,9 @@ In order to fix this follow the next steps:

include::{es-ref-dir}/tab-widgets/troubleshooting/data/total-shards-per-node-widget.asciidoc[]

****
If you're using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to https://www.elastic.co/guide/en/cloud/current/ec-autoops.html[Monitor with AutoOps].
****



@@ -17,5 +17,8 @@ In order to fix this follow the next steps:

include::{es-ref-dir}/tab-widgets/troubleshooting/data/increase-tier-capacity-widget.asciidoc[]

****
If you're using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to https://www.elastic.co/guide/en/cloud/current/ec-autoops.html[Monitor with AutoOps].
****


4 changes: 4 additions & 0 deletions docs/reference/troubleshooting/diagnostic.asciidoc
@@ -15,6 +15,10 @@ https://discuss.elastic.co[Elastic Discuss] to minimize turnaround time.

See this https://www.youtube.com/watch?v=Bb6SaqhqYHw[this video] for a walkthrough of capturing an {es} diagnostic.

****
If you're using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to https://www.elastic.co/guide/en/cloud/current/ec-autoops.html[Monitor with AutoOps].
****

[discrete]
[[diagnostic-tool-requirements]]
=== Requirements

@@ -3,6 +3,10 @@

This guide describes how to fix common errors and problems with {es} clusters.

****
If you're using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to https://www.elastic.co/guide/en/cloud/current/ec-autoops.html[Monitor with AutoOps].
****

<<fix-watermark-errors,Watermark errors>>::
Fix watermark errors that occur when a data node is critically low on disk space
and has reached the flood-stage disk usage watermark.

@@ -14,5 +14,8 @@ information about the problem:

include::{es-ref-dir}/tab-widgets/troubleshooting/snapshot/repeated-snapshot-failures-widget.asciidoc[]

****
If you're using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to https://www.elastic.co/guide/en/cloud/current/ec-autoops.html[Monitor with AutoOps].
****


@@ -8,3 +8,7 @@ The current shards capacity of the cluster is available in the
<<health-api-response-details-shards-capacity, health API shards capacity section>>.

include::{es-ref-dir}/tab-widgets/troubleshooting/troubleshooting-shards-capacity-widget.asciidoc[]

****
If you're using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to https://www.elastic.co/guide/en/cloud/current/ec-autoops.html[Monitor with AutoOps].
****

@@ -7,6 +7,10 @@ Elasticsearch balances shards across data tiers to achieve a good compromise bet
* disk usage
* write load (for indices in data streams)

****
If you're using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to https://www.elastic.co/guide/en/cloud/current/ec-autoops.html[Monitor with AutoOps].
****

Elasticsearch does not take into account the amount or complexity of search queries when rebalancing shards.
This is indirectly achieved by balancing shard count and disk usage.


@@ -17,6 +17,10 @@ logs.

* The master may appear busy due to frequent cluster state updates.

****
If you're using Elastic Cloud Hosted, then you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, real-time issue detection and resolution paths. For more information, refer to https://www.elastic.co/guide/en/cloud/current/ec-autoops.html[Monitor with AutoOps].
****

To troubleshoot a cluster in this state, first ensure the cluster has a
<<discovery-troubleshooting,stable master>>. Next, focus on the nodes
unexpectedly leaving the cluster ahead of all other issues. It will not be