2023.1: zed merge #823

Merged
merged 65 commits into from
Dec 7, 2023
65 commits
30467e1
Add DWPD to Hardware Overview dashboard
technowhizz Sep 6, 2023
ab444c9
Add DWPD alerts
technowhizz Oct 13, 2023
f1293b9
Add release note
technowhizz Sep 6, 2023
0fc39a0
docs: Add in-place upgrade to RL9 migration
markgoddard Nov 8, 2023
1e218c7
docs: RL9 migration in-place updates
markgoddard Nov 9, 2023
31b0c40
docs: RL9 migration in place fix
markgoddard Nov 9, 2023
94e88e1
Bump nova images
jovial Nov 10, 2023
19e7740
Merge pull request #771 from stackhpc/bugfix/xena/mdev
markgoddard Nov 13, 2023
eec69ac
Merge branch 'stackhpc/yoga' into DWPD
dougszumski Nov 13, 2023
5c4270f
Fix Grafana HAProxy dashboard (again)
priteau Nov 14, 2023
5d66109
Merge pull request #782 from stackhpc/haproxy-dashboard-instance-label
markgoddard Nov 15, 2023
c872280
Adds Ubuntu Jammy & Rocky 9 CIS benchmark hardening playbooks (#685)
jovial Nov 15, 2023
09851d0
docs: Add info on purge-command-not-found.yml custom playbook
markgoddard Nov 15, 2023
be0d64e
Merge pull request #783 from stackhpc/zed-yoga-merge
markgoddard Nov 15, 2023
45ac990
CI: Don't fail fast on container image build job failure
markgoddard Nov 15, 2023
c156c76
Merge pull request #784 from stackhpc/yoga-docs-cnf
markgoddard Nov 15, 2023
b4edf2d
Merge pull request #760 from stackhpc/yoga-rl9-in-place-docs
markgoddard Nov 15, 2023
ed5176d
docs: Add overcloud host image to RL9 migration guide
markgoddard Nov 15, 2023
44ce4d8
Merge pull request #789 from stackhpc/yoga-rl9-doc-host-image
markgoddard Nov 15, 2023
31cfce1
Merge pull request #787 from stackhpc/wallaby-container-build-no-fail…
markgoddard Nov 15, 2023
0187d46
Rocky9: Add section on routing rules (#788)
jovial Nov 15, 2023
bfd9ebc
Further additions to RL9 migration docs
MoteHue Nov 16, 2023
02eb8cd
Merge branch 'stackhpc/yoga' into DWPD
dougszumski Nov 17, 2023
4fb099b
Merge pull request #792 from stackhpc/rl9-migration-docs-additions
markgoddard Nov 17, 2023
533ee57
Merge pull request #621 from stackhpc/DWPD
technowhizz Nov 17, 2023
2b7b00b
Add more services to the rabbitmq-reset playbook
priteau Nov 20, 2023
f6ceb41
Configure SELinux in permissive mode on RL9 hosts
priteau Nov 24, 2023
0a12762
Document new issues seen with Storage hosts
MoteHue Nov 24, 2023
5e50967
Converge on the right spelling of converge
MoteHue Nov 24, 2023
8e56c09
Use python3 -m venv for nova playbooks
MoteHue Nov 27, 2023
6628e10
Merge pull request #800 from stackhpc/new-virtualenv-command-for-nova…
MoteHue Nov 27, 2023
d28f1c0
Fixes various issues with the cis.yml playbook (#791)
jovial Nov 27, 2023
e8c7879
Merge pull request #794 from stackhpc/rabbitmq-reset-add-services
markgoddard Nov 27, 2023
d9dbd72
Enable hypervisor after RL9 compute migration
MoteHue Nov 27, 2023
5f00e91
Tox lint fixes
MoteHue Nov 27, 2023
5a78d18
Update doc/source/operations/rocky-linux-9.rst
MoteHue Nov 27, 2023
76e133a
Merge pull request #799 from stackhpc/rl9-migrations-ceph-issues
markgoddard Nov 28, 2023
047cb55
Merge pull request #793 from stackhpc/selinux-permissive
priteau Nov 28, 2023
1b44bac
Enable hypervisor after RL9 compute migration
MoteHue Nov 27, 2023
6082053
Merge branch 'enable-compute-after-rl9-migration' of https://github.c…
MoteHue Nov 28, 2023
80ba672
Fix cluster health in Grafana Elasticsearch dashboard
priteau Nov 28, 2023
5120dfe
Merge pull request #804 from stackhpc/grafana-elasticsearch-status
markgoddard Nov 29, 2023
58d6372
Merge pull request #801 from stackhpc/enable-compute-after-rl9-migration
markgoddard Nov 29, 2023
e9b0477
Add rekey-hosts.yml playbook
Alex-Welsh Nov 17, 2023
c527579
Rekey playbook misc improvements
Alex-Welsh Nov 17, 2023
0123c1f
Change rekey playbook to use existing ssh vars
Alex-Welsh Nov 20, 2023
6931e1c
Rework rekey-hosts.yml playbook
Alex-Welsh Nov 29, 2023
14caeb4
rekey-host.yml remove-key tag
Alex-Welsh Nov 30, 2023
6528e1d
Fix Wazuh agent playbook w/o using custom policies
MoteHue Dec 1, 2023
3c980be
Fix link to Release Train docs
priteau Dec 5, 2023
ab50878
Fix opensearch-migration command
priteau Dec 5, 2023
3703743
Merge pull request #813 from stackhpc/doc-fix
priteau Dec 5, 2023
dc7a34b
Merge pull request #812 from stackhpc/fix-wazuh-agent-custom-policies…
MoteHue Dec 5, 2023
1b4594a
Fix link to Release Train docs (really)
priteau Dec 5, 2023
b4fb79c
Merge pull request #814 from stackhpc/os-doc-fix
priteau Dec 5, 2023
f37a2b8
Merge pull request #815 from stackhpc/doc-fix
markgoddard Dec 5, 2023
22b7bc7
Merge pull request #796 from stackhpc/rekey
markgoddard Dec 6, 2023
5bb0e36
Merge stackhpc/wallaby into stackhpc/xena
markgoddard Dec 7, 2023
6991437
Merge stackhpc/xena into stackhpc/yoga
markgoddard Dec 7, 2023
49ba961
Merge pull request #821 from stackhpc/yoga-xena-merge
markgoddard Dec 7, 2023
ff03e1a
Merge stackhpc/yoga into stackhpc/zed
markgoddard Dec 7, 2023
2acf52a
Drop CentOS/Rocky 8 from CIS security hardening
markgoddard Dec 7, 2023
c4f5173
Remove SELinux overrides
markgoddard Dec 7, 2023
08fd1e8
Fix OpenSearch reno
markgoddard Dec 7, 2023
4b261c3
Merge stackhpc/zed into stackhpc/2023.1
markgoddard Dec 7, 2023
1 change: 1 addition & 0 deletions .github/workflows/stackhpc-container-image-build.yml
@@ -92,6 +92,7 @@ jobs:
     timeout-minutes: 720
     permissions: {}
     strategy:
+      fail-fast: false
       matrix: ${{ fromJson(needs.generate-tag.outputs.matrix) }}
     needs:
       - generate-tag
1 change: 1 addition & 0 deletions doc/source/configuration/index.rst
@@ -18,3 +18,4 @@ the various features provided.
    wazuh
    vault
    magnum-capi
+   security-hardening
17 changes: 17 additions & 0 deletions doc/source/configuration/release-train.rst
@@ -101,6 +101,23 @@ default apt repositories. This can be done on a host-by host basis by defining
the variables as host or group vars under ``etc/kayobe/inventory/host_vars`` or
``etc/kayobe/inventory/group_vars``.

For Ubuntu-based deployments, Pulp currently `lacks support
<https://github.com/pulp/pulp_deb/issues/419>`_ for certain types of content,
including i18n files and command-not-found indices. This breaks APT when the
``command-not-found`` package is installed:

.. code:: console

   E: Failed to fetch https://pulp.example.com/pulp/content/ubuntu/jammy-security/development/dists/jammy-security/main/cnf/Commands-amd64 404 Not Found

The ``purge-command-not-found.yml`` custom playbook can be used to uninstall
the package, prior to running any other APT commands. It may be installed as a
:kayobe-doc:`pre-hook <custom-ansible-playbooks.html#hooks>` to the ``host
configure`` commands. Note that if used as a hook, this playbook matches all
hosts, so will run against the seed, even when running ``overcloud host
configure``. Depending on the stage of deployment, some hosts may be
unreachable.
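
One way to install the playbook as a pre-hook is to symlink it into the hook directory of each relevant command. The sketch below assumes Kayobe's documented hook layout of ``$KAYOBE_CONFIG_PATH/hooks/<command>/pre.d`` and that the playbook lives under ``$KAYOBE_CONFIG_PATH/ansible``. The ``install_pre_hook`` helper, the ``05`` ordering prefix and the scratch directory are hypothetical and only there for illustration:

```shell
# Hypothetical helper: symlink a custom playbook into a Kayobe pre-hook directory.
# Usage: install_pre_hook <config_path> <command-dir> <playbook> <prefix>
install_pre_hook() {
  config_path="$1"; command_dir="$2"; playbook="$3"; prefix="$4"
  hook_dir="$config_path/hooks/$command_dir/pre.d"
  mkdir -p "$hook_dir"
  # A relative link keeps the configuration repository relocatable.
  ln -sfn "../../../ansible/$playbook" "$hook_dir/$prefix-$playbook"
}

# Demonstration against a scratch directory; use a real checkout instead.
scratch="$(mktemp -d)"
mkdir -p "$scratch/ansible"
touch "$scratch/ansible/purge-command-not-found.yml"
install_pre_hook "$scratch" overcloud-host-configure purge-command-not-found.yml 05
install_pre_hook "$scratch" seed-host-configure purge-command-not-found.yml 05
```

Because the playbook matches all hosts, a hook installed this way for ``overcloud host configure`` will also attempt to reach the seed.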

For Rocky Linux based systems, package manager configuration is provided by
``stackhpc_dnf_repos`` in ``etc/kayobe/dnf.yml``, which points to package
repositories on the local Pulp server. To use this configuration, the
42 changes: 42 additions & 0 deletions doc/source/configuration/security-hardening.rst
@@ -0,0 +1,42 @@
==================
Security Hardening
==================

CIS Benchmark Hardening
-----------------------

The roles from the `Ansible-Lockdown <https://github.com/ansible-lockdown>`_
project are used to harden hosts in accordance with the CIS benchmark criteria.
They will not raise your benchmark score to 100%, but should provide a
significant improvement over an unhardened system. A typical score would be 70%.

The following operating systems are supported:

- Ubuntu 22.04
- Rocky 9

Configuration
--------------

Some overrides to the role defaults are provided in
``$KAYOBE_CONFIG_PATH/inventory/group_vars/overcloud/cis``. These may not be
suitable for all deployments and so some fine tuning may be required. For
instance, you may want different rules on a network node compared to a
controller. It is best to consult the upstream role documentation for details
about what each variable does. The documentation can be found here:

- `Ubuntu 22.04 <https://github.com/ansible-lockdown/UBUNTU22-CIS>`__
- `Rocky 9 <https://github.com/ansible-lockdown/RHEL9-CIS>`__

Running the playbooks
---------------------

As there is potential for unintended side effects when applying the hardening
playbooks, the playbooks are not currently enabled by default. It is recommended
that they are first applied to a representative staging environment to determine
whether or not workloads or API requests are affected by any configuration changes.

.. code-block:: console

   kayobe playbook run $KAYOBE_CONFIG_PATH/ansible/cis.yml

20 changes: 13 additions & 7 deletions etc/kayobe/ansible/cis.yml
@@ -4,12 +4,18 @@
   hosts: overcloud
   become: true
   tasks:
-    - name: Remove /etc/motd
-      # See remediation in:
-      # https://github.com/wazuh/wazuh/blob/bfa4efcf11e288c0a8809dc0b45fdce42fab8e0d/ruleset/sca/centos/8/cis_centos8_linux.yml#L777
-      file:
-        path: /etc/motd
-        state: absent
+    - name: Ensure the cron package is installed on ubuntu
+      package:
+        name: cron
+        state: present
+      when: ansible_facts.distribution == 'Ubuntu'
+
     - include_role:
-        name: ansible-lockdown.rhel8_cis
+        name: ansible-lockdown.rhel9_cis
+      when: ansible_facts.os_family == 'RedHat' and ansible_facts.distribution_major_version == '9'
       tags: always
+
+    - include_role:
+        name: ansible-lockdown.ubuntu22_cis
+      when: ansible_facts.distribution == 'Ubuntu' and ansible_facts.distribution_major_version == '22'
+      tags: always
1 change: 1 addition & 0 deletions etc/kayobe/ansible/nova-compute-disable.yml
@@ -11,6 +11,7 @@
     - name: Set up openstack cli virtualenv
       pip:
         virtualenv: "{{ venv }}"
+        virtualenv_command: "/usr/bin/python3 -m venv"
         name:
           - python-openstackclient
         state: latest
1 change: 1 addition & 0 deletions etc/kayobe/ansible/nova-compute-drain.yml
@@ -11,6 +11,7 @@
     - name: Set up openstack cli virtualenv
       pip:
         virtualenv: "{{ venv }}"
+        virtualenv_command: "/usr/bin/python3 -m venv"
         name:
           - python-openstackclient
         state: latest
1 change: 1 addition & 0 deletions etc/kayobe/ansible/nova-compute-enable.yml
@@ -11,6 +11,7 @@
     - name: Set up openstack cli virtualenv
       pip:
         virtualenv: "{{ venv }}"
+        virtualenv_command: "/usr/bin/python3 -m venv"
         name:
           - python-openstackclient
         state: latest
6 changes: 3 additions & 3 deletions etc/kayobe/ansible/rabbitmq-reset.yml
@@ -1,6 +1,6 @@
 ---
 # Reset a broken RabbitMQ cluster.
-# Also restarts OpenStack services which may be broken.
+# Also restarts all OpenStack services using RabbitMQ.

 - name: Reset RabbitMQ
   hosts: controllers
@@ -65,7 +65,7 @@
   tags:
     - restart-openstack
   tasks:
-    # The following services can have problems if the cluster gets broken.
+    # The following services use RabbitMQ.
     - name: Restart OpenStack services
       shell: >-
-        systemctl -a | egrep '(cinder|heat|ironic|keystone|magnum|neutron|nova)' | awk '{ print $1 }' | xargs systemctl restart
+        systemctl -a | egrep '(barbican|blazar|cinder|cloudkitty|designate|heat|ironic|keystone|magnum|manila|neutron|nova|octavia)' | awk '{ print $1 }' | xargs systemctl restart
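
To see which units the widened pattern selects, the filter can be run against sample input without restarting anything. This is a minimal sketch: the pattern is copied from the updated task, while the ``systemctl -a`` lines and unit names are hypothetical and will differ between deployments. ``grep -E`` is used as the equivalent of ``egrep``:

```shell
# Service pattern copied from the updated task.
pattern='(barbican|blazar|cinder|cloudkitty|designate|heat|ironic|keystone|magnum|manila|neutron|nova|octavia)'

# Hypothetical `systemctl -a` lines; real unit names depend on the deployment.
sample_units() {
  cat <<'EOF'
kolla-nova_compute-container.service loaded active running docker kolla-nova_compute-container.service
kolla-octavia_worker-container.service loaded active running docker kolla-octavia_worker-container.service
kolla-rabbitmq-container.service loaded active running docker kolla-rabbitmq-container.service
sshd.service loaded active running OpenSSH server daemon
EOF
}

# Same filter as the playbook, minus the final `xargs systemctl restart`.
selected="$(sample_units | grep -E "$pattern" | awk '{ print $1 }')"
printf '%s\n' "$selected"
```

With this sample, the Nova and Octavia units are selected; the previous pattern would have skipped the Octavia unit.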
117 changes: 117 additions & 0 deletions etc/kayobe/ansible/rekey-hosts.yml
@@ -0,0 +1,117 @@
---
# Playbook to rotate SSH keys across the cloud. By default it will rotate the
# standard keys used by kayobe/kolla-ansible, but it can be configured for any
# keys.

- name: Rekey hosts
  hosts: overcloud,seed,seed-hypervisor,infra-vms
  gather_facts: false
  vars:
    # The existing key is the key that is currently used to access overcloud hosts
    existing_private_key_path: "{{ ssh_private_key_path }}"
    existing_public_key_path: "{{ ssh_public_key_path }}"
    # The new key is the key that will be generated by this playbook
    new_private_key_path: "{{ ssh_private_key_path }}"
    new_public_key_path: "{{ ssh_public_key_path }}"
    new_key_type: "{{ ssh_key_type }}"
    # The existing key will locally be moved to deprecated_key_path once it is replaced
    deprecated_key_path: ~/old_ssh_key
    rekey_users:
      - stack
      - kolla
    rekey_remove_existing_key: false
  tasks:
    - name: Stat existing private key file
      ansible.builtin.stat:
        path: "{{ existing_private_key_path }}"
      register: stat_result
      delegate_to: localhost
      run_once: true

    - name: Fail when existing private key does not exist
      ansible.builtin.fail:
        msg: "No existing private key file found. Check existing_private_key_path is set correctly."
      when:
        - not stat_result.stat.exists
      delegate_to: localhost
      run_once: true

    - name: Stat existing public key file
      ansible.builtin.stat:
        path: "{{ existing_public_key_path }}"
      register: stat_result
      delegate_to: localhost
      run_once: true

    - name: Fail when existing public key does not exist
      ansible.builtin.fail:
        msg: "No existing public key file found. Check existing_public_key_path is set correctly."
      when:
        - not stat_result.stat.exists
      delegate_to: localhost
      run_once: true

    - name: Generate a new SSH key
      community.crypto.openssh_keypair:
        path: "{{ existing_private_key_path }}_new"
        type: "{{ new_key_type }}"
      delegate_to: localhost
      run_once: true

    - name: Set new authorized keys
      vars:
        lookup_path: "{{ existing_private_key_path }}_new.pub"
      ansible.posix.authorized_key:
        user: "{{ item }}"
        state: present
        key: "{{ lookup('file', lookup_path) }}"
      loop: "{{ rekey_users }}"
      become: true

    - name: Locally deprecate existing key (private)
      command: "mv {{ existing_private_key_path }} {{ deprecated_key_path }}"
      delegate_to: localhost
      run_once: true

    - name: Locally deprecate existing key (public)
      command: "mv {{ existing_public_key_path }} {{ deprecated_key_path }}.pub"
      delegate_to: localhost
      run_once: true

    - name: Locally promote new key (private)
      command: "mv {{ existing_private_key_path }}_new {{ new_private_key_path }}"
      delegate_to: localhost
      run_once: true

    - name: Locally promote new key (public)
      command: "mv {{ existing_private_key_path }}_new.pub {{ new_public_key_path }}"
      delegate_to: localhost
      run_once: true

    - block:
        - name: Stat old key file
          ansible.builtin.stat:
            path: "{{ deprecated_key_path }}.pub"
          register: stat_result
          delegate_to: localhost
          run_once: true

        - name: Fail when deprecated public key does not exist
          ansible.builtin.fail:
            msg: "No deprecated public key file found. Check deprecated_key_path is set correctly."
          when:
            - not stat_result.stat.exists
          delegate_to: localhost
          run_once: true

        - name: Remove old key from hosts
          vars:
            lookup_path: "{{ deprecated_key_path }}.pub"
          ansible.posix.authorized_key:
            user: "{{ item }}"
            state: absent
            key: "{{ lookup('file', lookup_path) }}"
          loop: "{{ rekey_users }}"
          become: true
      tags: remove-key
      when: rekey_remove_existing_key | bool
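
The order of the local file operations in this playbook can be sketched in a scratch directory. This is an illustration under assumptions, not the playbook itself: the paths are hypothetical stand-ins for ``existing_private_key_path`` and ``deprecated_key_path``, and placeholder files are used where the playbook generates a real key pair with ``community.crypto.openssh_keypair``:

```shell
# Hypothetical stand-ins for the playbook variables, in a scratch directory.
workdir="$(mktemp -d)"
existing_key="$workdir/id_rsa"          # existing_private_key_path
deprecated_key="$workdir/old_ssh_key"   # deprecated_key_path

# Placeholder for the key pair currently in use.
printf 'old-private\n' > "$existing_key"
printf 'old-public\n' > "$existing_key.pub"

# 1. Generate the new key pair next to the existing one, with a _new suffix.
printf 'new-private\n' > "${existing_key}_new"
printf 'new-public\n' > "${existing_key}_new.pub"

# 2. (On the hosts) the new public key is added to authorized_keys here,
#    before the existing key is touched, so access is never lost.

# 3. Move the existing pair aside, then promote the new pair into its place.
mv "$existing_key" "$deprecated_key"
mv "$existing_key.pub" "$deprecated_key.pub"
mv "${existing_key}_new" "$existing_key"
mv "${existing_key}_new.pub" "$existing_key.pub"
```

With the default ``rekey_remove_existing_key: false``, the final block is skipped and the old public key remains authorised on the hosts.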
13 changes: 10 additions & 3 deletions etc/kayobe/ansible/requirements.yml
@@ -12,9 +12,16 @@ collections:
     version: 2.4.0
 roles:
   - src: stackhpc.vxlan
-  - name: ansible-lockdown.rhel8_cis
-    src: https://github.com/ansible-lockdown/RHEL8-CIS
-    version: 1.3.0
+  - name: ansible-lockdown.ubuntu22_cis
+    src: https://github.com/stackhpc/UBUNTU22-CIS
+    #FIXME: Waiting for https://github.com/ansible-lockdown/UBUNTU22-CIS/pull/174
+    # to be in a tagged release
+    version: bugfix/inject-facts
+  - name: ansible-lockdown.rhel9_cis
+    src: https://github.com/stackhpc/RHEL9-CIS
+    #FIXME: Waiting for https://github.com/ansible-lockdown/RHEL9-CIS/pull/115
+    # to be in a tagged release.
+    version: bugfix/inject-facts
   - name: wazuh-ansible
     src: https://github.com/stackhpc/wazuh-ansible
     version: stackhpc
4 changes: 3 additions & 1 deletion etc/kayobe/ansible/wazuh-agent.yml
@@ -28,7 +28,9 @@
         owner: wazuh
         group: wazuh
         block: sca.remote_commands=1
-      when: custom_sca_policies.files | length > 0
+      when:
+        - custom_sca_policies_folder.stat.exists
+        - custom_sca_policies.files | length > 0
       notify:
         - Restart wazuh-agent
