Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nixos/prometheus: systemd unit hardening of exporters #133189

Merged
merged 2 commits into from
Oct 7, 2021

Conversation

erdnaxe
Copy link
Member

@erdnaxe erdnaxe commented Aug 8, 2021

Motivation for this change

This increases the isolation of Prometheus exporters service.

I was unable to run OpenVPN, PyAirControl and Kea tests as they seem broken even without these changes.

Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS linux)
  • Built on platform(s)
    • NixOS (NGINX and Blackbox only)
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Fits CONTRIBUTING.md.

@erdnaxe erdnaxe requested a review from WilliButz as a code owner August 8, 2021 21:15
@github-actions github-actions bot added 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 8.has: module (update) This PR changes an existing module in `nixos/` labels Aug 8, 2021
@erdnaxe erdnaxe force-pushed the prometheus-hardening branch from 49a6d9d to f3ca9e2 Compare August 8, 2021 21:18
@ofborg ofborg bot added 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin 10.rebuild-linux: 0 This PR does not cause any packages to rebuild on Linux labels Aug 8, 2021
Copy link
Member

@mweinelt mweinelt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was unable to run OpenVPN, PyAirControl and Kea tests as they seem broken even without these changes.

Kea fixed in #133201, please rebase.

Poking @lheckemann for OpenVPN

openvpn # [   11.470102] openvpn_exporter[784]: 2021/08/09 00:01:29 Starting OpenVPN Exporter
openvpn # [   11.486253] openvpn_exporter[784]: 2021/08/09 00:01:29 Listen address: 0.0.0.0:9176
openvpn: output:
Traceback (most recent call last):
  File "/nix/store/2c972bcrwpqz8lcmq94r5m4yda89a5sa-nixos-test-driver/bin/.nixos-test-driver-wrapped", line 1065, in <module>
    run_tests(args.interactive)
  File "/nix/store/2c972bcrwpqz8lcmq94r5m4yda89a5sa-nixos-test-driver/bin/.nixos-test-driver-wrapped", line 937, in run_tests
    test_script()
  File "/nix/store/2c972bcrwpqz8lcmq94r5m4yda89a5sa-nixos-test-driver/bin/.nixos-test-driver-wrapped", line 1035, in test_script
openvpn # [   11.493818] openvpn_exporter[784]: 2021/08/09 00:01:29 Metrics path: /metrics
    exec(pathlib.Path(args.testscript).read_text(), globals())
  File "<string>", line 4, in <module>
  File "/nix/store/2c972bcrwpqz8lcmq94r5m4yda89a5sa-nixos-test-driver/bin/.nixos-test-driver-wrapped", line 482, in succeed
    raise Exception(
Exception: command `curl -sSf http://localhost:9176/metrics | grep 'openvpn_up{.*} 1'` failed (exit code 1)
cleaning up
killing openvpn (pid 9)
(0.00 seconds)
builder for '/nix/store/abcvk9zh8bfbgqq6q9d4afk36siaxjhq-vm-test-run-prometheus-openvpn-exporter.drv' failed with exit code 1

Poking @mupdt @urbas for py-air-control-exporter, which has failing tests and therefore fails to build.

============================= test session starts ==============================
platform linux -- Python 3.9.6, pytest-6.2.4, py-1.10.0, pluggy-0.13.1
rootdir: /build/py-air-control-exporter-0.3.0, configfile: setup.cfg
plugins: cov-2.12.1
collected 17 items

test/test_app.py .                                                       [  5%]
test/test_main.py F..F.                                                  [ 35%]
test/test_metrics.py ...........                                         [100%]

=================================== FAILURES ===================================
__________________________________ test_help ___________________________________

monkeypatch = <_pytest.monkeypatch.MonkeyPatch object at 0x7ffff55dc220>
capfd = <_pytest.capture.CaptureFixture object at 0x7ffff55a4310>

    def test_help(monkeypatch, capfd):
        monkeypatch.setattr(sys, "argv", ["app-name", "--help"])
        with pytest.raises(SystemExit) as ex_info:
            main()
        assert ex_info.value.code == 0
>       assert "Usage: app-name" in capfd.readouterr().out
E       AssertionError: assert 'Usage: app-name' in 'Usage: python -m pytest.app-name [OPTIONS]\n\nOptions:\n  --host TEXT                     The hostname of the air pur...                           requests.  [default: 9896]\n  --help                          Show this message and exit.\n'
E        +  where 'Usage: python -m pytest.app-name [OPTIONS]\n\nOptions:\n  --host TEXT                     The hostname of the air pur...                           requests.  [default: 9896]\n  --help                          Show this message and exit.\n' = CaptureResult(out='Usage: python -m pytest.app-name [OPTIONS]\n\nOptions:\n  --host TEXT                     The hostn...                  requests.  [default: 9896]\n  --help                          Show this message and exit.\n', err='').out
E        +    where CaptureResult(out='Usage: python -m pytest.app-name [OPTIONS]\n\nOptions:\n  --host TEXT                     The hostn...                  requests.  [default: 9896]\n  --help                          Show this message and exit.\n', err='') = <bound method CaptureFixture.readouterr of <_pytest.capture.CaptureFixture object at 0x7ffff55a4310>>()
E        +      where <bound method CaptureFixture.readouterr of <_pytest.capture.CaptureFixture object at 0x7ffff55a4310>> = <_pytest.capture.CaptureFixture object at 0x7ffff55a4310>.readouterr

test/test_main.py:15: AssertionError
____________________________ test_unknown_protocol _____________________________

monkeypatch = <_pytest.monkeypatch.MonkeyPatch object at 0x7ffff55217c0>
capfd = <_pytest.capture.CaptureFixture object at 0x7ffff55bf070>

    def test_unknown_protocol(monkeypatch, capfd):
        """check that failure is reporter if an invalid protocol is provided"""
        monkeypatch.setattr(
            sys, "argv", ["app-name", "--host", "192.168.1.123", "--protocol", "foobar"]
        )
        with pytest.raises(SystemExit) as ex_info:
            main()
        assert ex_info.value.code != 0
>       assert "invalid choice: foobar" in capfd.readouterr().err
E       assert 'invalid choice: foobar' in "Usage: python -m pytest.app-name [OPTIONS]\nTry 'python -m pytest.app-name --help' for help.\n\nError: Invalid value for '--protocol': 'foobar' is not one of 'http', 'coap', 'plain_coap'.\n"
E        +  where "Usage: python -m pytest.app-name [OPTIONS]\nTry 'python -m pytest.app-name --help' for help.\n\nError: Invalid value for '--protocol': 'foobar' is not one of 'http', 'coap', 'plain_coap'.\n" = CaptureResult(out='', err="Usage: python -m pytest.app-name [OPTIONS]\nTry 'python -m pytest.app-name --help' for help.\n\nError: Invalid value for '--protocol': 'foobar' is not one of 'http', 'coap', 'plain_coap'.\n").err
E        +    where CaptureResult(out='', err="Usage: python -m pytest.app-name [OPTIONS]\nTry 'python -m pytest.app-name --help' for help.\n\nError: Invalid value for '--protocol': 'foobar' is not one of 'http', 'coap', 'plain_coap'.\n") = <bound method CaptureFixture.readouterr of <_pytest.capture.CaptureFixture object at 0x7ffff55bf070>>()
E        +      where <bound method CaptureFixture.readouterr of <_pytest.capture.CaptureFixture object at 0x7ffff55bf070>> = <_pytest.capture.CaptureFixture object at 0x7ffff55bf070>.readouterr

test/test_main.py:59: AssertionError

----------- coverage: platform linux, python 3.9.6-final-0 -----------
Name                                 Stmts   Miss  Cover
--------------------------------------------------------
py_air_control_exporter/app.py          10      0   100%
py_air_control_exporter/main.py          9      0   100%
py_air_control_exporter/metrics.py      75      0   100%
--------------------------------------------------------
TOTAL                                   94      0   100%

=========================== short test summary info ============================
FAILED test/test_main.py::test_help - AssertionError: assert 'Usage: app-name...
FAILED test/test_main.py::test_unknown_protocol - assert 'invalid choice: foo...
========================= 2 failed, 15 passed in 1.07s =========================

@erdnaxe erdnaxe force-pushed the prometheus-hardening branch 2 times, most recently from deccb3d to e82208b Compare August 9, 2021 07:21
Copy link
Member

@SuperSandro2000 SuperSandro2000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add AF_UNIX as default instead of adding it to every other exporter?

@mweinelt
Copy link
Member

mweinelt commented Aug 9, 2021

Can we add AF_UNIX as default instead of adding it to every other exporter?

What's the point of hardening if you hand out permissions like candy?

@erdnaxe
Copy link
Member Author

erdnaxe commented Aug 9, 2021

Can we add AF_UNIX as default instead of adding it to every other exporter?

A Prometheus exporter should not require AF_UNIX (or AF_NETLINK for wireguard) by default as most exporters only expose a HTTP webpage. I prefer to add it explicitly to make clear which exporter requires it to collect data.

@erdnaxe erdnaxe force-pushed the prometheus-hardening branch from e82208b to 9d7419b Compare August 14, 2021 13:10
@erdnaxe
Copy link
Member Author

erdnaxe commented Aug 14, 2021

Poking @mupdt @urbas for py-air-control-exporter, which has failing tests and therefore fails to build.

Fixed by 06a8b08. I just rebased this PR to include this fix.

OpenVPN exporter is still broken: openvpn # [ 6.935036] openvpn_exporter[801]: 2021/08/14 13:11:20 Failed to scrape showq socket: parsing time "2021-08-14 13:11:20" as "Mon Jan 2 15:04:05 2006": cannot parse "2021-08-14 13:11:20" as "Mon"

@erdnaxe
Copy link
Member Author

erdnaxe commented Aug 14, 2021

The OpenVPN exporter has been broken with OpenVPN 2.5.0 (31cf796). I submitted an issue upstream: kumina/openvpn_exporter#48

Maybe we could:
option 1) mark the OpenVPN exporter broken for now and stop testing it,
option 2) patch the OpenVPN exporter.

What is your opinion?

@erdnaxe
Copy link
Member Author

erdnaxe commented Aug 20, 2021

OpenVPN exporter is now unmaintained: kumina/openvpn_exporter@1cd8158

Should we remove this exporter from NixOS?

@mweinelt
Copy link
Member

I think marking it as broken is the first step to removing it.

@erdnaxe
Copy link
Member Author

erdnaxe commented Aug 20, 2021

I think marking it as broken is the first step to removing it.

I can add meta.broken to the openvpn exporter package, but how can I disable the test? Should I comment it?

@erdnaxe
Copy link
Member Author

erdnaxe commented Aug 25, 2021

I have marked prometheus-openvpn-exporter as broken. I don't know what should be done with the corresponding test.

@mweinelt mweinelt requested a review from lheckemann August 25, 2021 16:38
@Ma27
Copy link
Member

Ma27 commented Sep 12, 2021

Is the openvpn-exporter the only thing that's missing from getting this merged? :)

@erdnaxe
Copy link
Member Author

erdnaxe commented Sep 12, 2021

Is the openvpn-exporter the only thing that's missing from getting this merged? :)

Yes I believe so. This exporter is broken and upstream is now unmaintained.

@erdnaxe
Copy link
Member Author

erdnaxe commented Sep 24, 2021

@mweinelt what change do you require?

@mweinelt
Copy link
Member

@GrahamcOfBorg eval

@lheckemann
Copy link
Member

I'm fine with the broken mark, we didn't have a module for it anyway (pending me finishing #98735 with a test)? I'll hopefully find some time to fix or replace it eventually, but that time is not now :/

Prometheus OpenVPN exporter has been broken since OpenVPN 2.5.0 changed
the format of the datetime to ISO8601. After submitting an issue to
upstream, the upstream decided to no longer maintain this exporter.
@erdnaxe erdnaxe force-pushed the prometheus-hardening branch from 48e8dda to 0b6148f Compare October 7, 2021 08:20
@mweinelt mweinelt merged commit 71dcb5e into NixOS:master Oct 7, 2021
@mweinelt
Copy link
Member

mweinelt commented Oct 7, 2021

Thank you!

@Baughn
Copy link
Contributor

Baughn commented Oct 18, 2021

This also breaks the node-exporter filesystem exporter for anything mounted under /home. ProtectHome needs to be set to false, or possibly some read-only state?

@mweinelt
Copy link
Member

mweinelt commented Oct 18, 2021

Same for the wifi module. Needs some more relaxing, likely AF_NETLINK in this case.

level=error ts=2021-10-18T16:54:24.939Z caller=collector.go:169 msg="collector failed" name=wifi duration_seconds=5.97e-05 err="failed to access wifi data: socket: address family not supported by protocol

@mweinelt
Copy link
Member

This also breaks the node-exporter filesystem exporter for anything mounted under /home. ProtectHome needs to be set to false, or possibly some read-only state?

The filesystem export works for /home for me with ProtectHome=true. Can you expand on your issue?

@Baughn
Copy link
Contributor

Baughn commented Oct 20, 2021

This also breaks the node-exporter filesystem exporter for anything mounted under /home. ProtectHome needs to be set to false, or possibly some read-only state?

The filesystem export works for /home for me with ProtectHome=true. Can you expand on your issue?

The problem affects any filesystem mounted under /home, i.e. /home/foo, /home/.zfs/snapshot automounts, or similar. It does not affect /home itself; that directory entry is part of /, so it's still readable even with ProtectHome enabled.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 8.has: module (update) This PR changes an existing module in `nixos/` 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin 10.rebuild-linux: 0 This PR does not cause any packages to rebuild on Linux
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants