Redis instances shut down when scheduler restarted #56
Comments
It looks to me like the failover_timeout logic is not quite right in mesoslib.go; see here:
I think the logic of using the failover timeout in GetFrameworkID is not correct:
See this PR which fixes the behaviour when a scheduler is restarted:
the PR looks good to me.
First of all thanks a lot for the contribution. Glad to hear that you are using mr-redis. I think mr-redis needs the leader-follower logic to be implemented so that more than one instance of this scheduler can be run at once for high-availability. Would you like to contribute that functionality?
@eastlondoner |
Hi @dhilipkumars We're running it by installing the package from universe, then going into Marathon and changing the docker image to point at our docker image: https://hub.docker.com/r/tractableio/mr-redis/
We also had to change the docker client API version setting in mr-redis to match the version of Docker running on our Agents before we built that docker image.
I guess I could push a new version to the universe, but I wouldn't want to push something that includes code changes that aren't in this (mainline) repo. Furthermore, for the latest DC/OS I think that the docker API should be 1.25!
n.b. this is the commit I am concerned about:
@eastlondoner Thank you |
Hi @dhilipkumars I have the same problem that is discussed in this issue. I would like to access the docker image https://hub.docker.com/r/tractableio/mr-redis/ to do some tests. Thank you
@daguero I don't work at Tractable anymore and I recall I did some hacky things that I didn't want to publish to make it work.
@eastlondoner OK, thanks for your help, I'll try it
We have built mr-redis from the latest master and are running it on DC/OS (using ZooKeeper rather than etcd).
The basics work OK, but when the scheduler is restarted the existing Redis instances shut down and don't come back.
If you call the /STATUS endpoint it says that the Redis instances are up, but looking in Mesos they're not running any more.