Using the primary-replica example, create the pr-primary, pr-replica and pr-replica-2 pods.
Run watch to monitor the primary and switch master in case of failure.
Kill pr-primary. The watcher identifies the master failure and promotes pr-replica to master.
After this we can insert/delete database entries (working fine, as expected).
Now kill pr-replica (labelled as pr-primary since the original pr-primary was killed).
The watcher does not initiate a failover.
Watcher logs for the 1st failover (successful) and the 2nd failure (no failover):
INFO[2018-08-06T10:25:14Z] Successfully reached 'pr-primary'
INFO[2018-08-06T10:25:44Z] Health Checking: 'pr-primary'
ERRO[2018-08-06T10:25:54Z] dial tcp 10.96.29.55:5432: i/o timeout
ERRO[2018-08-06T10:25:54Z] Could not reach 'pr-primary' (Attempt: 1)
INFO[2018-08-06T10:25:54Z] Executing pre-hook: /hooks/watch-pre-hook
INFO[2018-08-06T10:25:54Z] Processing Failover: Strategy - latest
INFO[2018-08-06T10:25:54Z] Deleting existing primary...
INFO[2018-08-06T10:25:54Z] Deleted old primary
INFO[2018-08-06T10:25:54Z] Choosing failover replica...
INFO[2018-08-06T10:25:54Z] Chose failover target (pr-replica)
INFO[2018-08-06T10:25:54Z] Promoting failover replica...
DEBU[2018-08-06T10:25:54Z] executing cmd: [/opt/cpm/bin/promote.sh] on pod pr-replica in namespace default container: postgres
INFO[2018-08-06T10:25:54Z] Relabeling failover replica...
DEBU[2018-08-06T10:25:54Z] label: name
DEBU[2018-08-06T10:25:54Z] label: replicatype
INFO[2018-08-06T10:25:54Z] Executing post-hook: /hooks/watch-post-hook
INFO[2018-08-06T10:26:24Z] Health Checking: 'pr-primary'
INFO[2018-08-06T10:26:24Z] Successfully reached 'pr-primary'
INFO[2018-08-06T10:26:54Z] Health Checking: 'pr-primary'
INFO[2018-08-06T10:26:54Z] Successfully reached 'pr-primary'
INFO[2018-08-06T10:27:24Z] Health Checking: 'pr-primary'
INFO[2018-08-06T10:27:24Z] Successfully reached 'pr-primary'
INFO[2018-08-06T10:27:54Z] Health Checking: 'pr-primary'
INFO[2018-08-06T10:27:54Z] Successfully reached 'pr-primary'
INFO[2018-08-06T10:28:24Z] Health Checking: 'pr-primary'
INFO[2018-08-06T10:28:24Z] Successfully reached 'pr-primary'
INFO[2018-08-06T10:28:54Z] Health Checking: 'pr-primary'
ERRO[2018-08-06T10:29:04Z] dial tcp 10.96.29.55:5432: i/o timeout
ERRO[2018-08-06T10:29:04Z] Could not reach 'pr-primary' (Attempt: 1)
INFO[2018-08-06T10:29:34Z] Health Checking: 'pr-primary'
ERRO[2018-08-06T10:29:44Z] dial tcp 10.96.29.55:5432: i/o timeout
ERRO[2018-08-06T10:29:44Z] Could not reach 'pr-primary' (Attempt: 1)
INFO[2018-08-06T10:30:14Z] Health Checking: 'pr-primary'
ERRO[2018-08-06T10:30:24Z] dial tcp 10.96.29.55:5432: i/o timeout
ERRO[2018-08-06T10:30:24Z] Could not reach 'pr-primary' (Attempt: 1)
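
To make the pattern in the log above easier to spot, here is a minimal Python sketch that scans watcher output for "Could not reach" errors that are not followed by a "Processing Failover" line before the next health check. It is not part of the watcher; the function name `unhandled_failures` and the matching rules are my own assumptions, based only on the message text shown in this log.

```python
import re

# Matches lines like: ERRO[2018-08-06T10:29:04Z] Could not reach 'pr-primary' (Attempt: 1)
LINE_RE = re.compile(r"^(?P<level>[A-Z]{4})\[(?P<ts>[^\]]+)\]\s+(?P<msg>.*)$")


def unhandled_failures(log_text):
    """Return timestamps of 'Could not reach' errors with no failover
    logged before the next health check (or before the log ends)."""
    unhandled = []
    pending = None  # timestamp of a failure still waiting for a failover
    for raw in log_text.splitlines():
        m = LINE_RE.match(raw.strip())
        if not m:
            continue  # skip anything that is not a watcher log line
        ts, msg = m.group("ts"), m.group("msg")
        if msg.startswith("Could not reach"):
            pending = ts
        elif msg.startswith("Processing Failover"):
            pending = None  # the failure was handled
        elif msg.startswith("Health Checking") and pending is not None:
            unhandled.append(pending)  # next check began, no failover seen
            pending = None
    if pending is not None:
        unhandled.append(pending)
    return unhandled


# Excerpt of the log from this report.
SAMPLE = """\
INFO[2018-08-06T10:25:44Z] Health Checking: 'pr-primary'
ERRO[2018-08-06T10:25:54Z] Could not reach 'pr-primary' (Attempt: 1)
INFO[2018-08-06T10:25:54Z] Processing Failover: Strategy - latest
INFO[2018-08-06T10:28:54Z] Health Checking: 'pr-primary'
ERRO[2018-08-06T10:29:04Z] Could not reach 'pr-primary' (Attempt: 1)
INFO[2018-08-06T10:29:34Z] Health Checking: 'pr-primary'
ERRO[2018-08-06T10:29:44Z] Could not reach 'pr-primary' (Attempt: 1)
"""

if __name__ == "__main__":
    for ts in unhandled_failures(SAMPLE):
        print("no failover after failure at", ts)
```

On the excerpt, the first failure (10:25:54) is followed by a failover and is not reported, while the later ones are. Note that every failure is logged as "(Attempt: 1)", including the repeated ones after the first failover.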