[main] Added additional entries for troubleshooting unhealthy cluster (#119914) #120233

@@ -78,35 +78,31 @@ A shard can become unassigned for several reasons. The following tips outline the
most common causes and their solutions.
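
If it isn't obvious which shard is affected or why, you can ask {es} directly
with the allocation explain API. This is a minimal sketch; `my-index` and the
shard number are placeholders for the index and shard reported as unassigned.

[source,console]
----
GET _cluster/allocation/explain
{
  "index": "my-index",
  "shard": 0,
  "primary": false
}
----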

[discrete]
[[fix-cluster-status-reenable-allocation]]
===== Re-enable shard allocation
[[fix-cluster-status-only-one-node]]
===== Single node cluster

You typically disable allocation during a <<restart-cluster,restart>> or other
cluster maintenance. If you forgot to re-enable allocation afterward, {es} will
be unable to assign shards. To re-enable allocation, reset the
`cluster.routing.allocation.enable` cluster setting.
{es} will never assign a replica to the same node as the primary shard, so a single-node cluster always has yellow status while any index has replicas configured. To change the status to green, set <<dynamic-index-number-of-replicas,number_of_replicas>> to 0 for all indices.

[source,console]
----
PUT _cluster/settings
{
"persistent" : {
"cluster.routing.allocation.enable" : null
}
}
----

See https://www.youtube.com/watch?v=MiKKUdZvwnI[this video] for a walkthrough of troubleshooting "no allocations are allowed".
Therefore, if the number of replicas equals or exceeds the number of nodes, some shards won't be allocated.
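
For example, the following request removes replicas from every index so that a
single node can hold all shards. This is only a sketch: replace `_settings`
with `<index-name>/_settings` if you want to change only specific indices.

[source,console]
----
PUT _settings
{
  "index.number_of_replicas": 0
}
----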

[discrete]
[[fix-cluster-status-recover-nodes]]
===== Recover lost nodes

Shards often become unassigned when a data node leaves the cluster. This can
occur for several reasons, ranging from connectivity issues to hardware failure.
occur for several reasons:

* A manual node restart will cause a temporary unhealthy cluster state until the node recovers.

* When a node becomes overloaded or fails, the cluster can temporarily enter an unhealthy state. Prolonged garbage collection (GC) pauses, caused by out-of-memory errors or high memory usage during intensive searches, can trigger this state. See <<fix-cluster-status-jvm,Reduce JVM memory pressure>> for more JVM-related issues.

* Network issues can prevent reliable node communication, causing shards to become out of sync. Check the logs for repeated messages about nodes leaving and rejoining the cluster.
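
In addition to checking the logs, you can list the nodes that are currently
part of the cluster to confirm whether a node has dropped out. The column
selection below is just one example:

[source,console]
----
GET _cat/nodes?v=true&h=name,node.role,master,heap.percent,cpu,uptime
----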

After you resolve the issue and recover the node, it will rejoin the cluster.
{es} will then automatically allocate any unassigned shards.

You can monitor this process by <<cluster-health,checking your cluster health>>. The number of unassigned shards should progressively decrease until green status is reached.
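
For example, the following request returns the cluster status and shard counts.
The `filter_path` parameter is optional and only trims the response:

[source,console]
----
GET _cluster/health?filter_path=status,number_of_nodes,unassigned_shards,delayed_unassigned_shards
----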

To avoid wasting resources on temporary issues, {es} <<delayed-allocation,delays
allocation>> by one minute by default. If you've recovered a node and don’t want
to wait for the delay period, you can call the <<cluster-reroute,cluster reroute
@@ -155,7 +151,7 @@ replica, it remains unassigned. To fix this, you can:

* Change the `index.number_of_replicas` index setting to reduce the number of
replicas for each primary shard. We recommend keeping at least one replica per
primary.
primary for high availability.

[source,console]
----
@@ -166,7 +162,6 @@ PUT _settings
----
// TEST[s/^/PUT my-index\n/]


[discrete]
[[fix-cluster-status-disk-space]]
===== Free up or increase disk space
@@ -187,6 +182,8 @@ If your nodes are running low on disk space, you have a few options:

* Upgrade your nodes to increase disk space.

* Add more nodes to the cluster.

* Delete unneeded indices to free up space. If you use {ilm-init}, you can
update your lifecycle policy to use <<ilm-searchable-snapshot,searchable
snapshots>> or add a delete phase. If you no longer need to search the data, you
@@ -219,11 +216,39 @@ watermark or set it to an explicit byte value.
PUT _cluster/settings
{
"persistent": {
"cluster.routing.allocation.disk.watermark.low": "30gb"
"cluster.routing.allocation.disk.watermark.low": "90%",
"cluster.routing.allocation.disk.watermark.high": "95%"
}
}
----
// TEST[s/"30gb"/null/]
// TEST[s/"90%"/null/]
// TEST[s/"95%"/null/]

[IMPORTANT]
====
This is usually a temporary solution and may cause instability if disk space is not freed up.
====
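
While you work on a longer-term fix, you can keep an eye on per-node disk usage
with the cat allocation API (the `v=true` parameter only adds column headers):

[source,console]
----
GET _cat/allocation?v=true
----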

[discrete]
[[fix-cluster-status-reenable-allocation]]
===== Re-enable shard allocation

You typically disable allocation during a <<restart-cluster,restart>> or other
cluster maintenance. If you forgot to re-enable allocation afterward, {es} will
be unable to assign shards. To re-enable allocation, reset the
`cluster.routing.allocation.enable` cluster setting.

[source,console]
----
PUT _cluster/settings
{
"persistent" : {
"cluster.routing.allocation.enable" : null
}
}
----
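
If you want to confirm that the override has been cleared, you can inspect the
current cluster settings. The `flat_settings` parameter is optional and only
flattens the keys for easier scanning:

[source,console]
----
GET _cluster/settings?flat_settings=true
----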

See https://www.youtube.com/watch?v=MiKKUdZvwnI[this video] for a walkthrough of troubleshooting "no allocations are allowed".

[discrete]
[[fix-cluster-status-jvm]]
@@ -271,4 +296,4 @@ POST _cluster/reroute
// TEST[s/^/PUT my-index\n/]
// TEST[catch:bad_request]

See https://www.youtube.com/watch?v=6OAg9IyXFO4[this video] for a walkthrough of troubleshooting `no_valid_shard_copy`.