Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OpenMetrics exporter does not respect openmetrics format #597

Closed
rmannibucau opened this issue Jul 22, 2020 · 15 comments
Closed

OpenMetrics exporter does not respect openmetrics format #597

rmannibucau opened this issue Jul 22, 2020 · 15 comments
Assignees

Comments

@rmannibucau
Copy link
Contributor

Hi,

Not sure I missed something but using https://prometheus.io/docs/instrumenting/exposition_formats/#text-based-format as a reference, it seems mp-metrics does not match the exporter format due to the numerous metric types.
To be portable you must only use counters and gauges - even histograms are not matching.
Wonder if metrics shouldn't be split in metrics-openmetrics (agree name sucks but it is what it is) and metrics which would import the openmetrics part.
Think it is ok to just have counters and gauges in the openmetrics sub-api (this is the main usage since all other ones can be compensated by the backend).

@trotman23
Copy link

trotman23 commented Oct 5, 2020

Also ran in to this today, specifically with the base_gc_total counter, but there are plenty of other duplicates.

...
# HELP base_gc_total Displays the total number of collections that have occurred. This attribute lists -1 if the collection count is undefined for this collector.
# TYPE base_gc_total counter
base_gc_total{name="PS MarkSweep1"} 5.0
...
# HELP base_gc_total Displays the total number of collections that have occurred. This attribute lists -1 if the collection count is undefined for this collector.
# TYPE base_gc_total counter
base_gc_total{name="PS Scavenge1"} 79.0

As pointed out above in the spec:
https://prometheus.io/docs/instrumenting/exposition_formats/#comments-help-text-and-type-information

Lines with a # as the first non-whitespace character are comments. They are ignored unless the first token after # is either HELP or TYPE. Those lines are treated as follows: If the token is HELP, at least one more token is expected, which is the metric name. All remaining tokens are considered the docstring for that metric name. HELP lines may contain any sequence of UTF-8 characters (after the metric name), but the backslash and the line feed characters have to be escaped as \\ and \n, respectively. Only one HELP line may exist for any given metric name.

EDIT: My specific issue was fixed indirectly with a wildfly upgrade from 18->20, but YMMV.

@donbourne
Copy link
Contributor

donbourne commented Oct 13, 2020

Did some investigation on this.

MP Metric's histogram and timer use summary type in prometheus output format, but are missing the _sum metric.

https://prometheus.io/docs/instrumenting/exposition_formats/#text-based-format says summary includes:

# Finally a summary, which has a complex representation, too:
# HELP rpc_duration_seconds A summary of the RPC duration in seconds.
# TYPE rpc_duration_seconds summary
rpc_duration_seconds{quantile="0.01"} 3102
rpc_duration_seconds{quantile="0.05"} 3272
rpc_duration_seconds{quantile="0.5"} 4773
rpc_duration_seconds{quantile="0.9"} 9001
rpc_duration_seconds{quantile="0.99"} 76656
rpc_duration_seconds_sum 1.7560473e+07            // missing in below representation
rpc_duration_seconds_count 2693

MP Metrics spec (section 4.2.8 / 4.2.9):

# TYPE application_file_sizes_bytes summary
# HELP application_file_sizes_bytes Users file size
application_file_sizes_bytes_count 2037
application_file_sizes_bytes{quantile="0.5"} 4201
application_file_sizes_bytes{quantile="0.75"} 6175
application_file_sizes_bytes{quantile="0.95"} 13560
application_file_sizes_bytes{quantile="0.98"} 29643
application_file_sizes_bytes{quantile="0.99"} 31716
application_file_sizes_bytes{quantile="0.999"} 31716

@donbourne
Copy link
Contributor

Note also that MP Metrics uses only Counter, Gauge and Summary types. Prometheus exposition format has a concept of Histogram that isn't used in MP Metrics exporter (which just uses Summary and Gauge metrics for Histogram output). So given we use a subset of the Prometheus types I don't think we have any compatibility issue (since we don't introduce any types that Prometheus doesn't have).

@donbourne
Copy link
Contributor

@trotman23 , I agree with your point about multiple HELP lines -- that's pretty clear in the text you pasted.

@rmannibucau
Copy link
Contributor Author

@donbourne what about meters, timers, ...? I'm using https://github.com/eclipse/microprofile-metrics/blob/master/tck/rest/src/main/java/org/eclipse/microprofile/metrics/test/MpMetricTest.java#L510 as a reference. Assume you do a parser of the exporter string, it does not match openmetrics - or in a very surprising manner at least.

@jmartisk
Copy link
Contributor

The thing with duplicate HELP lines was a bug in SmallRye and therefore WildFly, and is now fixed. Apparently it wasn't detected by the TCK, so we might want to improve that. I filed #616

@donbourne
Copy link
Contributor

I confirmed we've already addressed the HELP line issue in Open Liberty as well.

@donbourne
Copy link
Contributor

@rmannibucau , the MP Metrics export format for openmetrics just relies on counts, gauges and summaries -- so should be interpretable by Prometheus. Is what you're saying that MP Metrics doesn't use the prescribed histogram output for a histogram metric?

@rmannibucau
Copy link
Contributor Author

@donbourne openmetrics does not define the format used by mp even if each line is compliant so a praser - potentially not prometheus since it is a spec - can fail being 100% compliant. MP should stay in the 100% defined area.

@donbourne
Copy link
Contributor

@rmannibucau , I feel like I'm still missing something. If we're just using gauges, counters and summaries, as defined by OpenMetrics, I feel like we are being 100% compliant (once we fix the missing sum in summary). can you give example of what we're missing?

@rmannibucau
Copy link
Contributor Author

@donbourne spec says "you will get A, B, C or D, E" and Mp does "you get A, B, C, D, E" which is not what spec says but includes it (the relationship is reversed IMHO, MP extended openmetrics where it should comply to it to be compatible).

@Channyboy
Copy link
Contributor

@rmannibucau

@donbourne what about meters, timers, ...? I'm using https://github.com/eclipse/microprofile-metrics/blob/master/tck/rest/src/main/java/org/eclipse/microprofile/metrics/test/MpMetricTest.java#L510 as a reference. Assume you do a parser of the exporter string, it does not match openmetrics - or in a very surprising manner at least.

That links to JSON output.

spec says "you will get A, B, C or D, E" and Mp does "you get A, B, C, D, E"

The Prometheus Exposition format page does not seem to indicate a limitation on the type of compatible metric types to be used?

Now that the sum value has been fixed, not sure what else is uncompliant?

@rmannibucau
Copy link
Contributor Author

@Channyboy get your point but it means what I'm complaining about: as a consumer you dont know what you get which is as not having a spec so if we reuse a well known format we should stick to what it enables and define and let the out of scope features stay togglable (off by default) only. So prometheus exporter shouldn't have all the not defined metric types by default at least.

@Channyboy
Copy link
Contributor

Channyboy commented Oct 13, 2021

@rmannibucau I think that is where the confusion where @donbourne and I have.

We're not sure what you mean by out of scope features (metric types)?

MP uses counter, gauge, summary.
Here a the sample outputs from the MP spec ( sample out put of each metric type in the prometheus/open-metrics format):
https://github.com/eclipse/microprofile-metrics/blob/master/spec/src/main/asciidoc/rest-endpoints.adoc#openmetrics-format

From https://prometheus.io/docs/instrumenting/exposition_formats/#text-based-format I'm not sure where the MP output would cause a conflict?

@rmannibucau
Copy link
Contributor Author

Hmm, from master I don't see the issues I hit (and the application which was relying on that dropped mp so can't check easily anymore :s). Let's consider it fixed for now and we'll see later. Maybe it was before openmetrics was fully embrassed (during prometheus -> openmetrics migration?).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants