Smart health related checks are executed against hardware raids #357

Open
err404r opened this issue Nov 29, 2024 · 4 comments · May be fixed by prometheus-community/smartctl_exporter#260

Comments

err404r commented Nov 29, 2024

The new revision (rev 84) of Hardware Observer introduced SMART health checks for drives. Unfortunately, the collector tries to run these checks against all block devices, including hardware RAIDs. The checks fail on those devices and generate false-positive alerts.

Thank you for reporting your feedback!

The internal ticket has been created: https://warthogs.atlassian.net/browse/SOLENG-946.

This message was autogenerated

Contributor

aieri commented Dec 5, 2024

@err404r yes, the exporter doesn't really distinguish between logical and physical block devices and exports metrics for anything listed by smartctl --scan. Which alerts are firing? Which metrics are triggering the alerts?
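
To illustrate the point about `smartctl --scan`, here is a minimal Python sketch that parses scan output and groups the listed entries by their `-d` device type, which may help when deciding which entries to exclude. It is not how the exporter itself works. The function names `parse_scan_line` and `group_by_type` are hypothetical, and the sample lines are illustrative rather than taken from the affected system; they assume the usual `path -d type # comment` layout of scan output.

```python
from collections import defaultdict


def parse_scan_line(line):
    """Return (device_path, device_type) for one line of scan output, or None."""
    # smartctl appends a human-readable comment after '#'; drop it.
    fields = line.split("#", 1)[0].split()
    if not fields:
        return None
    path = fields[0]
    device_type = None
    if "-d" in fields:
        idx = fields.index("-d")
        if idx + 1 < len(fields):
            device_type = fields[idx + 1]
    return path, device_type


def group_by_type(scan_output):
    """Group device paths by the base of their -d type ('megaraid,0' -> 'megaraid')."""
    groups = defaultdict(list)
    for line in scan_output.splitlines():
        parsed = parse_scan_line(line)
        if parsed is None:
            continue
        path, device_type = parsed
        base = device_type.split(",", 1)[0] if device_type else "unknown"
        groups[base].append(path)
    return dict(groups)


# Illustrative input only (assumption); not captured from the reporter's machine.
SAMPLE = """\
/dev/sda -d scsi # /dev/sda, SCSI device
/dev/bus/0 -d megaraid,0 # /dev/bus/0 [megaraid_disk_00], SCSI device
/dev/bus/0 -d megaraid,1 # /dev/bus/0 [megaraid_disk_01], SCSI device
"""

if __name__ == "__main__":
    print(group_by_type(SAMPLE))
```

Grouping by type does not by itself say which entry is a logical volume; that still has to be judged per controller.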

Author

err404r commented Dec 10, 2024

Alert name: SmartHealthStatusFail
Rule:
smartctl_device_smart_status{juju_application="hardware-observer",juju_model="ceph",juju_model_uuid="bdb2f523-0ca5-463b-8061-8686da1151db"} == 0

Contributor

Deezzir commented Dec 13, 2024

@err404r Can you confirm that the firing alerts are missing the device label?
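
One way to check this is sketched below in Python, assuming the label sets of the firing series have already been collected as plain dictionaries (for example, the `metric` objects from a Prometheus instant query result). The helper name `series_missing_label` and the sample label values are hypothetical; only the metric name and the `juju_application` and `juju_model` values come from the rule quoted above.

```python
def series_missing_label(series, label="device"):
    """Return the label sets that lack `label` or carry it with an empty value."""
    return [labels for labels in series if not labels.get(label)]


# Hypothetical label sets for illustration; the device value is made up.
FIRING = [
    {
        "__name__": "smartctl_device_smart_status",
        "juju_application": "hardware-observer",
        "juju_model": "ceph",
        "device": "sda",
    },
    {
        "__name__": "smartctl_device_smart_status",
        "juju_application": "hardware-observer",
        "juju_model": "ceph",
    },
]

if __name__ == "__main__":
    for labels in series_missing_label(FIRING):
        print("series without device label:", labels)
```

An empty `device` value is treated the same as a missing one here, since both would make the alert hard to attribute to a drive.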
