Define prometheus metrics to k8s #1021
Co-authored-by: Roy Paulin <[email protected]>
Each of those PromQL queries needs to be tested on Prometheus to make sure it represents what the user wants. You can easily set up Prometheus. You also need a basic understanding of Prometheus metric types (counter, gauge, ...) as well as functions like increase, sum, etc., and when to use them.
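One way to test a query outside the adapter is to send it to the Prometheus HTTP API instant-query endpoint (`/api/v1/query`). The following is a minimal sketch only: the helper name `build_instant_query_url` is hypothetical, and the address `http://localhost:9090` assumes a Prometheus server that has been port-forwarded locally.

```python
from urllib.parse import urlencode


def build_instant_query_url(base_url: str, promql: str) -> str:
    """Return the URL for a Prometheus instant query (GET /api/v1/query)."""
    # urlencode escapes the brackets, parentheses and spaces found in PromQL.
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


# Query taken from the adapter.yaml under review.
QUERY = "sum(increase(vertica_sessions_running_counter[60m])) by (namespace, pod)"

if __name__ == "__main__":
    # Assumed address: Prometheus port-forwarded to localhost:9090.
    print(build_instant_query_url("http://localhost:9090", QUERY))
```

The printed URL can be opened with any HTTP client, and the JSON result compared with the value reported by Vertica.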
prometheus/adapter.yaml
Outdated
# name:
# matches: "^vertica_sessions_running_counter$"
# as: "vertica_sessions_running_counter"
metricsQuery: 'sum(increase(vertica_sessions_running_counter[60m])) by (namespace, pod)'
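To illustrate why the choice of function depends on the metric type, here is a simplified sketch of what `increase` computes for a counter compared with a plain average for a gauge. It is not Prometheus's actual implementation (which also extrapolates to the window boundaries); the function names and sample values are made up for illustration, and it assumes the metric in the query above really is a counter.

```python
def simple_increase(samples):
    """Approximate increase(): sum the deltas, treating a drop as a counter reset."""
    total = 0.0
    for prev, cur in zip(samples, samples[1:]):
        # After a reset the counter restarts from 0, so the new value is the delta.
        total += cur - prev if cur >= prev else cur
    return total


def simple_avg(samples):
    """Approximate avg_over_time(): arithmetic mean of the gauge samples."""
    return sum(samples) / len(samples)


if __name__ == "__main__":
    counter_samples = [10, 15, 20, 3, 8]  # hypothetical counter with one reset
    gauge_samples = [10.0, 12.0, 8.0]     # hypothetical gauge readings
    print(simple_increase(counter_samples))
    print(simple_avg(gauge_samples))
```

Applying `increase` to a gauge, or an average to a raw counter, produces numbers that look plausible but do not mean what the user expects, which is why each query should be checked against the metric type.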
Have you played around with these queries on prometheus to check if they make sense?
Yeah, I ran the Vertica API to get the value and compared it with the Prometheus API result. For percentage values such as CPU/memory usage averaged over time, the adapter returns a quantity such as 10174m, which means an average of 10.174% over the hour when using the avg_over_time function. I left the example in the code.
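The `m` suffix is the Kubernetes milli-unit notation, so the value is divided by 1000. A small sketch of that conversion follows; `parse_quantity` is a hypothetical helper that handles only plain numbers and the `m` suffix, not the full Kubernetes quantity grammar.

```python
from decimal import Decimal


def parse_quantity(value: str) -> Decimal:
    """Convert a plain or milli-suffixed quantity string to a decimal number."""
    if value.endswith("m"):
        # "m" means thousandths: 10174m is 10174 / 1000.
        return Decimal(value[:-1]) / Decimal(1000)
    return Decimal(value)


if __name__ == "__main__":
    print(parse_quantity("10174m"))  # the value quoted in the comment above
```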
prometheus/adapter.yaml
Outdated
# name:
# matches: "^vertica_cpu_aggregate_usage_percentage$"
# as: "vertica_cpu_aggregate_usage_percentage" # If rename needed
metricsQuery: 'avg_over_time(vertica_cpu_aggregate_usage_percentage[60m])' # 10174m means 10.174% per hour in average Ref: https://github.com/kubernetes-sigs/prometheus-adapter/blob/master/docs/walkthrough.md#quantity-values
For each of these queries we need a detailed description.
For sure, but should we do the description in the developer doc, instead of here in adapter.yaml?
Do we expect users to use the metrics we provide, or do we provide examples here and expect them to customize on their own?
Did you discuss these with Cai?
This is an error from a local run of the test case:
logger.go:42: 16:50:45 | prometheus-sanity/10-deploy-prometheus | Error: INSTALLATION FAILED: rendered manifests contain a resource that already exists. Unable to continue with install: ClusterRole "prometheus-kube-prometheus-operator" in namespace "" exists and cannot be imported into the current release: invalid ownership metadata; annotation validation error: key "meta.helm.sh/release-namespace" must equal "kuttl-test-talented-mongrel": current value is "prometheus"
logger.go:42: 16:50:45 | prometheus-sanity/10-deploy-prometheus | make: *** [Makefile:667: deploy-prometheus] Error 1
case.go:399: failed in step 10-deploy-prometheus
case.go:401: command "cd ../../.. && make deploy-prometheus PROMETHEUS_NAMESPACE=$NAMESPACE" failed, exit status 2
We should not create an extra namespace. All test-related resources must be installed in the kuttl namespace, which is deleted when the test finishes. If it is not possible to use a single namespace, we can create extra ones, but with names that are unlikely to collide with existing ones. Once the test case has finished, the extra namespaces can be deleted.
One thing I think would be useful is an example of a VerticaAutoscaler with the metrics set, so it can serve as a reference for how to use it properly. It could be a file in the prometheus folder that contains an example for each.
There is one in config./sample. We can add another one for custom metrics.
The current Prometheus integration test uses the kuttl namespace; we didn't use an extra namespace there: NAMESPACE=$1
@roypaulin @LiboYu2 I added an example for the autoscaler with custom metrics in another PR: #1033
This PR adds custom metrics when deploying the Prometheus adapter.