support a manual failover being initiated by a user or application #39

Open
jmccormick2001 opened this issue Apr 3, 2018 · 8 comments

@jmccormick2001
Contributor

This feature would let crunchy-watch support a manual failover, perhaps through a REST API. Another application, or an end user using curl for instance, might want to trigger a manual failover for scheduled maintenance or for other reasons, and they need an API through which to invoke this function.

@GajaHebbar

Is there any way to test the failover, since manual failover by killing the primary is not currently supported?

ERRO[2018-07-04T13:33:56Z] dial tcp: lookup pr-primary on 10.96.0.10:53: server misbehaving
ERRO[2018-07-04T13:33:56Z] Could not reach 'pr-primary' (Attempt: 1)
INFO[2018-07-04T13:33:56Z] Executing pre-hook: /hooks/watch-pre-hook
INFO[2018-07-04T13:33:56Z] Processing Failover: Strategy - latest
INFO[2018-07-04T13:33:56Z] Deleting existing primary...
INFO[2018-07-04T13:33:57Z] Deleted old primary
INFO[2018-07-04T13:33:57Z] Choosing failover replica...
INFO[2018-07-04T13:33:57Z] Chose failover target (pr-replica)

INFO[2018-07-04T13:33:57Z] Promoting failover replica...
DEBU[2018-07-04T13:33:57Z] executing cmd: [/opt/cpm/bin/promote.sh] on pod pr-replica in namespace kube-system container: postgres
INFO[2018-07-04T13:33:57Z] Relabeling failover replica...
DEBU[2018-07-04T13:33:57Z] label: name
DEBU[2018-07-04T13:33:57Z] label: replicatype
INFO[2018-07-04T13:33:57Z] Executing post-hook: /hooks/watch-post-hook

I see logs indicating that the failover was successful, but pr-replica still does not have write permission after the primary was killed.

@davecramer
Contributor

davecramer commented Jul 6, 2018 via email

@GajaHebbar

GajaHebbar commented Jul 9, 2018 via email

@GajaHebbar

GajaHebbar commented Jul 9, 2018 via email

@davecramer
Contributor

davecramer commented Jul 9, 2018 via email

@GajaHebbar

GajaHebbar commented Jul 10, 2018 via email

@davecramer
Contributor

davecramer commented Jul 16, 2018

@GajaHebbar

One thing I am not sure about here is that Watch does not execute promote.sh on all replicas. Is that the problem?

Clearly that would be the problem, if that is the case.

@GajaHebbar

GajaHebbar commented Aug 3, 2018 via email
