Both the "automated" and "manual" procedures are based on the official Vault disaster recovery procedure.
This method is based on an Ansible playbook.
- Place the snapshot to restore in the artifact subdirectory, under the name latest-vault.snapshot.
- Run the restoration playbook:

      ansible-playbook -i artifacts/vault-inventory.yml ansible-playbooks/vault-snapshot-restore.yaml
      # 【output】
      # ... truncated ...
      # paas-staging-vault-1324e-gojrj : ok=36 changed=20 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
      # paas-staging-vault-1324e-ntxuj : ok=49 changed=24 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
      # paas-staging-vault-1324e-ribtp : ok=36 changed=20 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0

- Check your Vault cluster status
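
One way to check the status from a script is to rely on the exit code of `vault status`, which the Vault CLI documents as 0 when unsealed, 2 when sealed, and 1 on error. The sketch below assumes `VAULT_ADDR` already points at the node to check; the function name `vault_seal_state` is illustrative and not part of this repository.

```shell
# Hypothetical helper: translate the exit code of `vault status` into a word.
# Assumes VAULT_ADDR is exported and the vault binary is on the PATH.
vault_seal_state() {
  rc=0
  vault status >/dev/null 2>&1 || rc=$?
  case "$rc" in
    0) echo unsealed ;;  # node is initialized and unsealed
    2) echo sealed ;;    # node is reachable but sealed
    *) echo error ;;     # unreachable, misconfigured, or binary missing
  esac
}

# Example use on one node:
#   export VAULT_ADDR=https://<address-of-a-node>:8200
#   vault_seal_state
```

Running it against each node in turn gives a quick view of whether the whole cluster came back unsealed after the playbook.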
- You need a snapshot (from a backup, for example) available on one of the cluster peers.
- You need to make a copy of your root-token.txt: keep it in secure storage!
- You need to make a copy of your unseal keys (artifacts/vault-unseal-key-*.txt): keep them in secure storage!
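
The two copies above could be made with a small helper such as the one below. This is only a sketch: it assumes root-token.txt sits in the same directory as the unseal key files, and both the function name `backup_vault_secrets` and the destination directory are hypothetical. What counts as secure storage is left to you.

```shell
# Hypothetical helper: copy the root token and unseal keys to a private directory.
# $1 = source directory (assumed to hold root-token.txt and vault-unseal-key-*.txt)
# $2 = destination directory (created if missing)
backup_vault_secrets() {
  src_dir="$1"
  dest_dir="$2"
  # Create the destination and restrict it to the current user.
  mkdir -p "$dest_dir" || return 1
  chmod 700 "$dest_dir" || return 1
  # Copy the files, preserving timestamps and modes.
  cp -p "$src_dir"/root-token.txt "$dest_dir"/ || return 1
  cp -p "$src_dir"/vault-unseal-key-*.txt "$dest_dir"/ || return 1
  # Make the copies readable by the owner only.
  chmod 600 "$dest_dir"/*.txt
}

# Example use (destination path is an assumption):
#   backup_vault_secrets artifacts /path/to/secure/storage
```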
- Reset the Vault storage of each member. On each host:

      $ sudo systemctl stop vault-agent
      $ sudo systemctl stop vault-server
      $ sudo rm -rf /var/lib/vault/*

- Now you can rebuild a new cluster. From this repository, run the following playbook:

      ansible-playbook -i artifacts/vault-inventory.yml ansible-playbooks/vault-cluster-bootstrap.yaml

At this point, you have a brand-new, fully functional Vault cluster, but it no longer contains your data.
- From the host where a snapshot is available:

      sudo -iu vault
      export VAULT_TOKEN=<content of new root-token.txt>
      vault operator raft list-peers
      # 【output】
      # Node                            Address                 State     Voter
      # ----                            -------                 -----     -----
      # paas-staging-vault-addab-nvxka  194.182.170.142:8201    leader    true
      # paas-staging-vault-addab-mklsr  194.182.169.48:8201     follower  true
      # paas-staging-vault-addab-utxcg  89.145.162.86:8201      follower  true
      export VAULT_ADDR=https://<address-of-the-leader>:8200
      vault operator raft snapshot restore -force snap.snapshot

At this point, your cluster contains the content of your snapshot, but it is back in a sealed state.
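
In the restore step, the leader address is read by hand from the `list-peers` table. As a convenience, it could be extracted with a small filter like the one below. This is a sketch that assumes the four-column layout shown in the sample output (Node, Address, State, Voter); the function name `leader_host` is hypothetical.

```shell
# Hypothetical helper: read `vault operator raft list-peers` output on stdin
# and print the host part of the leader's address (cluster port stripped).
leader_host() {
  awk '$3 == "leader" { sub(/:[0-9]+$/, "", $2); print $2 }'
}

# Example use, before running the snapshot restore:
#   export VAULT_ADDR="https://$(vault operator raft list-peers | leader_host):8200"
```

The filter drops the Raft cluster port (8201 in the sample) so that the API port 8200 can be appended, matching the `VAULT_ADDR` form used in the procedure.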
- Restore the original root-token and unseal key(s) that you backed up before running this procedure.
- Unseal the cluster and restart the vault-agent:

      ansible-playbook -i artifacts/vault-inventory.yml ansible-playbooks/vault-cluster-unseal.yaml
      # 【output】
      # ... truncated ...
      # paas-staging-vault-addab-mklsr : ok=6 changed=2 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
      # paas-staging-vault-addab-nvxka : ok=6 changed=2 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
      # paas-staging-vault-addab-utxcg : ok=6 changed=2 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0

      ansible-playbook -i artifacts/vault-inventory.yml ansible-playbooks/vault-cluster-tls-agent.yaml
      # 【output】
      # ... truncated ...
      # paas-staging-vault-addab-mklsr : ok=4 changed=3 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
      # paas-staging-vault-addab-nvxka : ok=4 changed=3 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
      # paas-staging-vault-addab-utxcg : ok=4 changed=3 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0