Single server. Recovery from image on server with different IP fails. #758
Comments
Hello, I understand that hardcoding IP addresses in configuration files is not really good practice, but I guess we cannot assume that every deployment involving multiple nodes will have full DNS resolution working for every node. On the other hand, nothing prevents you from running the playbook again after infrastructure changes, so that the IP addresses in the configuration files are eventually refreshed; but I understand this can be tedious, especially after recovering from unplanned outages. I feel we should take this issue into consideration, but at the moment I am not sure the direction we should take is really to handle host references in two different possible ways (IP address, or FQDN when DNS resolution is available).
It is not just a recovery scenario where recovery activities are expected; in our case it is the setup of an autoscaling mechanism to automatically fail over to another server. Re-running the playbooks after a long time, on a recovered server where, no matter how hard we try, no one can guarantee they will still run without error, is not an option in an automated scenario like that. The AMI must boot into a new instance and be operative. The alternative is to develop a playbook that reconfigures all IPs in all the relevant places and runs as part of the new server's boot, as sketched below.
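A minimal sketch of what such a boot-time reconfiguration could look like, via cloud-init user data; the repository path, inventory file, and playbook name are assumptions for illustration, not the project's documented procedure:

```yaml
#cloud-config
# Hypothetical user data for the recovered instance: re-run the
# deployment playbook on first boot so that configuration files are
# regenerated with the new server's IP. Paths and file names below
# are assumed for illustration.
runcmd:
  - cd /opt/alfresco-ansible-deployment && ansible-playbook -i inventory_local.yml playbooks/acs.yml
```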
Just for documentation: this side issue seems to be addressed in #796.
Bug description
In scenarios where a single-server installation is recovered from an image of the server onto a new machine with a different IP, services do not work because they are configured with the old server's IP hardcoded in configuration files.
This scenario particularly affects cloud environments where backups are taken as snapshots, or where autoscaling systems are used to provide fault tolerance.
The playbooks seem to want to preserve the pre-1.2.0 behavior, where single-server installations defaulted to 127.0.0.1 as the IP for all components. But the Jinja expression used does not behave that way: the filter chain includes retrieving the server's IP from hostvars, which is invariably available and therefore always used, so the 127.0.0.1 default never applies.
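For illustration, a sketch of the pattern described (the group name and variable below are assumptions, not the playbooks' exact code):

```yaml
# Illustrative only. Once facts are gathered, the hostvars lookup
# always yields an address, so the default('127.0.0.1') fallback at
# the end of the chain never takes effect.
repo_host: "{{ hostvars[groups['repository'][0]]['ansible_default_ipv4']['address'] | default('127.0.0.1') }}"
```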
We have corrected the issue by introducing in the inventory a new custom host variable, named `dns`, to explicitly declare the host name to be used when configuring connectivity between components, on the understanding that it can differ from the Ansible inventory host name in some use cases. In its absence, it defaults to the inventory hostname, which in single-server deployments is `localhost`. As an example applied to `roles/common/defaults/main.yml` (a sketch of the approach; the exact variable names may differ):
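```yaml
# Sketch: prefer the explicit `dns` host variable when the inventory
# defines it, and fall back to the inventory hostname otherwise.
# Group and variable names here are illustrative.
repository_host: "{{ hostvars[groups['repository'][0]].dns | default(groups['repository'][0]) }}"
```

And the corresponding inventory excerpt where the new `dns` variable would be declared (host and domain names are hypothetical):

```yaml
all:
  children:
    repository:
      hosts:
        repo-node-1:
          dns: repo.internal.example.com
```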
In the case of `repo_hosts` the logic is a bit different. The `repo_host` variable, defined as the address of the first repository host in the inventory, does not seem to be used in the playbooks at all.
There are other files where IPs are used instead of DNS names, such as the template for `custom-slingshot-application-context.xml`. We generally consider it a best practice to use DNS names in all configurations, to protect the deployment from IP changes, which can happen for many reasons in traditional on-premises environments and especially in cloud setups.
Target OS
Any
Host OS
Any
Playbook version
1.2.0 onwards
Ansible error
No related Ansible error
Ansible context
Not relevant
ansible --version
Not relevant
ansible-config dump --only-changed
Not relevant
ansible-inventory -i your_inventory_file --graph
Default inventory_local.yml
pip list
Not relevant