ES Universe package on DC/OS Packet does not run #565
Comments
Please provide steps and configuration to recreate. We don't use Packet and don't test on DC/OS. Just plain old Mesos. |
Using this Terraform script: https://dcos.io/docs/1.7/administration/installing/cloud/packet/ (BTW, it seems to work correctly when installing DC/OS on Google Cloud, so it might be a specific Packet thing). |
OK, thanks. I can't vouch for the DC/OS installer, as that hasn't been updated for a long time, but the Marathon command should work. When you say expected ports, how are you specifying them? By default, ES lets Mesos pick random ports from its pool. You can override this using the |
I don't specify a specific port. |
The issue seems to be that Elasticsearch's Java process can't get the local address:
|
@philwinder how does this container attempt to get its address? Is it using a metadata service? |
In my case the problem is a statically configured |
@zsmith928 Just do:
$ dcos package repo add universe-jstabenow https://github.com/jstabenow/dcos-packages/archive/version-2.x.zip
$ dcos package install elasticsearch
Here is my workaround for the wrong "publish_host" on the executor: I replace only the argument of the framework by |
update: #569 |
Unfortunately, taking @jstabenow's helpful repo for a spin doesn't seem to help us. We're seeing the same thing: Java complains about not knowing what the AWS-supplied hostname ip-10-1-23-254 is, and then fails to bind to localhost. |
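The symptom described here (Java not recognizing the machine's own hostname) can usually be checked from the shell before involving Elasticsearch at all. The following is a minimal sketch, assuming a Linux agent where `getent` is available; `check_host` is a hypothetical helper name and is not part of the package or of DC/OS.

```shell
#!/bin/sh
# Sketch: does a hostname resolve through the system resolver (/etc/hosts, DNS)?
# Assumption: getent is installed (typical on glibc-based Linux).
# check_host is a hypothetical helper, not part of DC/OS or the framework.
check_host() {
    if getent hosts "$1" >/dev/null 2>&1; then
        echo "ok: $1 resolves"
    else
        echo "FAIL: $1 does not resolve (check /etc/hosts or DNS)"
    fi
}

# uname -n prints the node name, i.e. the machine's own hostname.
check_host "$(uname -n)"
```

On the agent from this report, `check_host ip-10-1-23-254` printing FAIL would point at name resolution on the host rather than at the package itself.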
Hey @jbirch, that sounds like a similar problem. On Google there are many articles about network problems with Elasticsearch / Java. That's why I added the
In my case Elasticsearch elected the wrong interface for the
Can you post the executor log and the results of executing the following commands on your machine "ip-10-1-23-254"?
$ docker run -it --net=host elasticsearch:latest --default.network.publish_host="10.1.23.254"
$ docker run -it --net=host elasticsearch:latest --default.network.publish_host="ip-10-1-23-254"
$ docker run -it --net=host elasticsearch:latest --default.network.publish_host=$(hostname -i)
This would be the right log:
[2016-05-29 12:19:40,859][INFO ][transport ] [Storm] publish_address {10.1.23.254:9300}, bound_addresses {[::]:9300}
[2016-05-29 12:19:44,093][INFO ][cluster.service ] [Storm] new_master {Storm}{N93bF9aPT1SEaqsHGsF6Eg}{10.1.23.254}{10.1.23.254:9300}, reason: zen-disco-join(elected_as_master, [0] joins received)
[2016-05-29 12:19:44,210][INFO ][http ] [Storm] publish_address {10.1.23.254:9200}, bound_addresses {[::]:9200}
And can you post the available
Hope we can find the problem and the right setting for you. |
Hey @jstabenow, thanks for taking the time to reply on the weekend to a stranger. I appreciate it. With respect to your commands:
"10.1.23.19": Comes up and binds to the given IP.
"$(hostname -i)":
"$LIPPROCESS_IP": Starts up and binds to 198.51.100.1.
The issue here is that I have no trouble starting elasticsearch:latest in DC/OS. It'll bind to 198.51.100.1 and start, much the same as if I didn't provide the --default.network.publish_host argument. My hope was that your package would help with mesos/elasticsearch-scheduler having a bad time. Regarding an existing env, here's the output of
|
Hey @jbirch
$ docker run -it --net=host elasticsearch:latest --default.network.publish_host=${LIBPROCESS_IP}
$ docker run -it --net=host elasticsearch:latest --default.network.publish_host=${HOST}
These two environment variables should work:
HOST=10.1.23.19
LIBPROCESS_IP=10.1.23.19 |
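The suggestion above amounts to preferring an explicitly configured address over whatever Elasticsearch picks on its own. Here is a minimal sketch of that selection order, assuming `LIBPROCESS_IP` and `HOST` are set in the environment as in the comment above. `pick_publish_host` is a hypothetical helper, and the command is only printed, not run.

```shell
#!/bin/sh
# pick_publish_host is a hypothetical helper (not part of the package).
# Order: LIBPROCESS_IP, then HOST, then whatever `hostname -i` reports.
pick_publish_host() {
    if [ -n "${LIBPROCESS_IP:-}" ]; then
        echo "$LIBPROCESS_IP"
    elif [ -n "${HOST:-}" ]; then
        echo "$HOST"
    else
        hostname -i
    fi
}

# Print the resulting command instead of running it, so the chosen
# address can be inspected before starting a container.
echo "docker run -it --net=host elasticsearch:latest --default.network.publish_host=$(pick_publish_host)"
```

If the printed address is not the one the agent is actually reachable on, the environment is the thing to fix before looking at the package.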
Hi all. Thanks @jstabenow for continuing to help out on this. To answer a previous question:
Remember that you can pass your own settings file and that the ES containers can be overridden. So I would oppose any core code changes that could otherwise be achieved by this. |
Hey @philwinder |
Hi @philwinder, We still have the case where
Note that the thing that fails to do the binding is https://github.com/mesos/elasticsearch/blob/1.0.1/commons/src/main/java/org/apache/mesos/elasticsearch/common/util/NetworkUtils.java:30, not Elasticsearch itself. Noting that there are myriad deployment options of the underlying platform on which
Caveat here being that maybe it's actually totally fine and my environment is just screwed up :) |
@jbirch I did all my manual testing on AWS, so I'm surprised there's a problem here. But I used vanilla Mesos, not DC/OS, so I assume it's some difference there. Can you post the log that shows the error? That might help decide what to do. Thanks, Phil |
I'm almost certain it's an issue on our end, and isn't indicative of the package itself generally "not working". I would expect something like
Tentatively let's call this one a layer 8 problem and I'll try and get things shored up on our end. It really does look more like "DNS isn't 100%" rather than " |
It keeps on deploying, waiting, failing on DC/OS 1.7.x on Packet. It seems to be unable to bind to expected ports.