OpenTelemetry integration #699
…ck latency of data update events feat(prometheus_metrics.py): create data_update_latency histogram to monitor latency of data update events
…l into prometheus_integration
…etrics to use opal_server.metrics.prometheus_metrics for better organization chore(requirements.txt): add prometheus_client to dependencies for metrics tracking functionality
…c to track updates per topic feat(prometheus_metrics.py): introduce data_update_count_per_topic counter for monitoring data updates by topic
… to enhance observability fix(api.py): increment policy bundle request count and measure latency for bundle generation fix(callbacks.py): observe size of changed directories in policy update notifications fix(task.py): track policy update count and latency when triggering policy watcher
Hey @psardana, thank you for this contribution! 💎 Can you please add documentation about the metrics and explain how to set them up? Note that there are conflicts with the main branch, so please make sure to rebase from master. Looking forward to this! 🙏 |
Thank you for the review! I have added commits for documentation, docker compose and fixed the label names. |
Looks very good! 🌟
I've left some comments about specific areas and improvements.
Upon review, some instrumented parts appear to align more with tracing than with pure metrics. This led to some unnatural metrics and duplication, such as separate latency, count, and error metrics for the same operation.
To address this, we suggest exploring OpenTelemetry, which offers native Prometheus integration alongside robust tracing capabilities.
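To illustrate why one trace span can stand in for separate latency, count, and error metrics, here is a minimal stdlib-only sketch. It is not the OpenTelemetry API: `ToyTracer` and `SpanRecord` are hypothetical names used only for illustration, and the span name is borrowed from the metric names in this PR.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class SpanRecord:
    """Hypothetical stand-in for a tracing span (not an OpenTelemetry type)."""
    name: str
    duration_s: float = 0.0
    error: Optional[str] = None


@dataclass
class ToyTracer:
    """Collects finished spans in memory so they can be inspected afterwards."""
    finished: List[SpanRecord] = field(default_factory=list)

    @contextmanager
    def span(self, name: str) -> Iterator[SpanRecord]:
        record = SpanRecord(name=name)
        start = time.perf_counter()
        try:
            yield record
        except Exception as exc:
            # The error is recorded on the span itself, then re-raised.
            record.error = type(exc).__name__
            raise
        finally:
            # Latency is captured whether the body succeeded or failed.
            record.duration_s = time.perf_counter() - start
            self.finished.append(record)

    def count(self, name: str) -> int:
        return sum(1 for s in self.finished if s.name == name)

    def error_count(self, name: str) -> int:
        return sum(1 for s in self.finished if s.name == name and s.error)


tracer = ToyTracer()

# One successful operation.
with tracer.span("opal_server_data_update"):
    pass

# One failing operation; the exception still propagates to the caller.
try:
    with tracer.span("opal_server_data_update"):
        raise ValueError("bad payload")
except ValueError:
    pass
```

In this sketch, the count, error count, and per-call latency are all derived from the same span records, so no separate counter or histogram has to be updated by hand at each call site.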
Here's a proposed mapping of the current metrics to OpenTelemetry:
- opal_server_data_update -> Trace
- opal_server_policy_update -> Trace
- opal_server_policy_bundle_request -> Trace
- opal_server_policy_bundle_size -> Metric
- opal_server_active_clients -> Metric
- opal_client_data_subscriptions -> Metric
- opal_client_data_update_trigger -> Trace
- opal_client_data_update_apply -> Trace (new)
- opal_client_policy_update_apply -> Trace (new)
- opal_client_policy_store_status -> Metric
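The proposed mapping could be written down as plain data, for example to drive a migration checklist. This is only a sketch: the names come from the list above, while `PROPOSED_SIGNALS`, `NEW_SIGNALS`, and `group_by_signal` are hypothetical helpers, not part of OPAL.

```python
from collections import defaultdict
from typing import Dict, List

# Signal type proposed in the review for each existing or new name.
PROPOSED_SIGNALS: Dict[str, str] = {
    "opal_server_data_update": "trace",
    "opal_server_policy_update": "trace",
    "opal_server_policy_bundle_request": "trace",
    "opal_server_policy_bundle_size": "metric",
    "opal_server_active_clients": "metric",
    "opal_client_data_subscriptions": "metric",
    "opal_client_data_update_trigger": "trace",
    "opal_client_data_update_apply": "trace",
    "opal_client_policy_update_apply": "trace",
    "opal_client_policy_store_status": "metric",
}

# Entries marked "(new)" in the review, i.e. not yet instrumented.
NEW_SIGNALS = {
    "opal_client_data_update_apply",
    "opal_client_policy_update_apply",
}


def group_by_signal(mapping: Dict[str, str]) -> Dict[str, List[str]]:
    """Group names by their proposed signal type, preserving list order."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for name, kind in mapping.items():
        grouped[kind].append(name)
    return dict(grouped)


grouped = group_by_signal(PROPOSED_SIGNALS)
```

Grouping this way makes it easy to see which names would become spans and which would remain gauges or counters.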
We believe this approach will provide a more comprehensive observability solution.
Please let us know your thoughts, and let's work together to enhance OPAL's observability 💎
docker/prometheus/prometheus/docker-compose-with-prometheus-metrics.yml
packages/opal-common/opal_common/monitoring/prometheus_metrics.py
packages/opal-common/opal_common/monitoring/prometheus_metrics.py
Looks very good! Kudos on the work. I've left a few small details to fix and improve. Let me know if you have questions and when they are ready.
Great job! 👏💎🎉
Fixes Issue
closes #701