We want to generate metering reports from operator-specific Prometheus metrics. To do that, operators must be instrumented to expose those metrics, and the operator-sdk should make this as easy as possible. The goal is for metering to be based on the usage of each individual operator, so the metrics will be derived from the objects managed by that operator.
To follow both the Prometheus instrumentation best practices and the official Kubernetes instrumentation guide, the metrics will have the following format:
crd_kind_info{namespace="namespace",crdkind="instance-name"} 1
An example metric for the memcached-operator would look like this:
memcached_info{namespace="default",memcached="example-memcached"} 1
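As an illustration of the format above, the following minimal sketch renders one such sample line from a kind, a namespace and an instance name. The helper name `infoMetricLine` is hypothetical and is not part of the operator-sdk or kube-state-metrics; it only shows how the metric name and the kind-specific label are derived from the lowercased kind.

```go
package main

import (
	"fmt"
	"strings"
)

// infoMetricLine renders one sample of the proposed "<kind>_info" metric in
// the Prometheus text exposition format. The name and signature are
// illustrative assumptions, not an existing operator-sdk API.
func infoMetricLine(kind, namespace, name string) string {
	k := strings.ToLower(kind)
	// %q wraps the label values in double quotes. For valid Kubernetes
	// namespaces and object names no further escaping is needed.
	return fmt.Sprintf("%s_info{namespace=%q,%s=%q} 1", k, namespace, k, name)
}

func main() {
	fmt.Println(infoMetricLine("Memcached", "default", "example-memcached"))
}
```

The value is always 1; the information is carried entirely by the labels, which is the usual convention for `_info` metrics.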
The solution uses Kubernetes list/watch to populate a Prometheus metrics registry. kube-state-metrics implements its own registry for performance reasons. We use kube-state-metrics because it solves exactly the problem we are facing, but for resources known upstream; the operator-sdk can reuse its functionality to do the same for custom resources. The kube-state-metrics library can only be used for constant (static) metrics, that is, metrics that are immutable and are therefore entirely regenerated on change. This fits the use case described above. It is not meant for, e.g., counting in performance-critical code paths. An operator would therefore need the kube-state-metrics library to expose the number of custom resources it manages and their details, and Prometheus client_golang to expose metrics about its own internals, e.g. the count of reconciliation loops.
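To make the idea of constant metrics that are "entirely regenerated on change" more concrete, here is a simplified sketch of such a store. It is not the kube-state-metrics implementation; `staticStore` and its methods are hypothetical names, and the list/watch machinery that would call `Replace` and `Delete` is assumed rather than shown.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// staticStore is a hypothetical, simplified model of a store for constant
// metrics: it keeps fully rendered lines per object instead of mutable counters.
type staticStore struct {
	mu      sync.RWMutex
	samples map[string][]string // object key (e.g. "namespace/name") -> rendered lines
}

func newStaticStore() *staticStore {
	return &staticStore{samples: make(map[string][]string)}
}

// Replace discards whatever was stored for key and stores the freshly
// generated lines. It would be called on both add and update events.
func (s *staticStore) Replace(key string, lines []string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.samples[key] = lines
}

// Delete drops all metrics of an object that no longer exists.
func (s *staticStore) Delete(key string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.samples, key)
}

// Render concatenates all stored lines, ordered by object key so that the
// output is deterministic.
func (s *staticStore) Render() string {
	s.mu.RLock()
	defer s.mu.RUnlock()
	keys := make([]string, 0, len(s.samples))
	for k := range s.samples {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	var b strings.Builder
	for _, k := range keys {
		for _, line := range s.samples[k] {
			b.WriteString(line)
			b.WriteByte('\n')
		}
	}
	return b.String()
}

func main() {
	s := newStaticStore()
	s.Replace("default/example-memcached", []string{`memcached_info{namespace="default",memcached="example-memcached"} 1`})
	s.Replace("default/other", []string{`memcached_info{namespace="default",memcached="other"} 1`})
	s.Delete("default/other") // the object was removed from the cluster
	fmt.Print(s.Render())
}
```

Because every event replaces the whole entry for an object, a scrape only has to concatenate pre-rendered lines. This is also why the approach does not suit values that must be incremented in place, such as a reconciliation counter.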
// NewCollectors returns a collection of metrics in the namespaces provided, per the api/kind resource.
// The metrics are registered in the custom generateStore function that needs to be defined.
func NewCollectors(
	client *Client,
	namespaces []string,
	api string,
	kind string,
	metricsGenerator func(obj interface{}) []*metrics.Metric,
) (collectors []*kcoll.Collector)

// ServeMetrics takes in the collectors that were created and the port number on which the metrics will be served.
func ServeMetrics(collectors []*kcoll.Collector, portNumber int)
Note: Because we rely on kube-state-metrics functions and interfaces, we cannot use prometheus/client_golang for these metrics. Instead, we need to register them the same way kube-state-metrics does, expose the /metrics endpoint, and serve it on a port (:8389/metrics, for example). For that we will also need to create a Service object, or rather update the current Service object.
Below is a rough example of what the kube-state-metrics based implementation will look like.
Users will have all of the code below already generated and included as part of the main.go file:
c := metrics.NewCollectors(client, []string{"default"}, resource, kind, MetricsGenerator)
metrics.ServeMetrics(c, 8389)
with the MetricsGenerator function living in the user's pkg/metrics package:
var (
	descMemInfo = ksmetrics.NewMetricFamilyDef(
		"memcached_info",
		"The information of the resource instance.",
		[]string{"namespace", "memcached"},
		nil,
	)
)
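// MetricsGenerator builds the memcached_info metric for a single custom resource.
func MetricsGenerator(obj interface{}) []*ksmetrics.Metric {
	ms := []*ksmetrics.Metric{}
	// Use the checked form of the type assertion so an unexpected object does not panic.
	crd, ok := obj.(*unstructured.Unstructured)
	if !ok {
		fmt.Printf("unexpected object type %T\n", obj)
		return ms
	}
	lv := []string{crd.GetNamespace(), crd.GetName()}
	m, err := ksmetrics.NewMetric(descMemInfo.Name, descMemInfo.LabelKeys, lv, float64(1))
	if err != nil {
		fmt.Println(err)
		return ms
	}
	ms = append(ms, m)
	return ms
}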
In the future, if the agreed-upon kube-state-metrics restructure happens (see kubernetes/kube-state-metrics#579), we can get rid of some of the duplicated functions. That will probably take a few months, and our user-facing interface should not change as a result.