Feat: Improve Prometheus Metrics #1338

Open: wants to merge 15 commits into base: main

Contributor

@dennypradipta dennypradipta commented Dec 27, 2024

Monika Pull Request (PR)

What feature/issue does this PR add

This PR resolves #1335 and #1336.

How did you implement / how did you fix it

I added some new exported Prometheus metrics and improved the existing ones. The details are below:

| Metric Name | Type | Purpose | Labels |
| --- | --- | --- | --- |
| monika_alerts_triggered | Counter | Collect count of alerts triggered by a probe | id, name, url, method, alertQuery |
| monika_alerts_triggered_total | Counter | Collect total count of alerts triggered | - |
| monika_probes_running | Gauge | Indicates whether a probe is running (1) or idle (0) | id |
| monika_probes_running_total | Gauge | Collect total count of probes currently running checks | - |
| monika_probes_status | Gauge | Indicates the current status of a probe: 0 = DOWN, 1 = UP | id, name, url, method |
| monika_probes_total | Gauge | Collect total number of probes configured | - |
| monika_request_response_size_bytes | Gauge | Collect size of the response in bytes | id, name, url, method, statusCode, result |
| monika_request_response_time_seconds | Histogram | Collect duration of probe request in seconds | id, name, url, method, statusCode, result |
| monika_request_status_code_info | Gauge | Collect HTTP status code of the probe request | id, name, url, method |
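
To make the Labels column more concrete, here is a minimal sketch of how one labelled sample and one unlabelled sample would look in the Prometheus text exposition format. The helper names `escapeLabelValue` and `formatSample` are hypothetical and are not part of Monika or of this PR; the label values are borrowed from the test config in this description, and the sample values (1 and 2) are illustrative only.

```typescript
// Hypothetical helpers for illustration; this is not Monika's actual code.
type Labels = Record<string, string>

// Escape a label value for the Prometheus text exposition format:
// backslash, double quote, and newline must be escaped.
function escapeLabelValue(value: string): string {
  return value
    .replace(/\\/g, '\\\\')
    .replace(/"/g, '\\"')
    .replace(/\n/g, '\\n')
}

// Render a single sample line: name{label="value",...} value
function formatSample(name: string, labels: Labels, value: number): string {
  const pairs = Object.entries(labels).map(
    ([key, val]) => `${key}="${escapeLabelValue(val)}"`
  )
  const labelPart = pairs.length > 0 ? `{${pairs.join(',')}}` : ''
  return `${name}${labelPart} ${value}`
}

// A probe that is UP (1), labelled with id, name, url, and method.
const upSample = formatSample(
  'monika_probes_status',
  {
    id: 'http-1',
    name: 'status-200-test',
    url: 'https://httpbin.org/status/200',
    method: 'GET',
  },
  1
)

// A metric without labels, shown as "-" in the table.
const totalSample = formatSample('monika_probes_total', {}, 2)

console.log(upSample)
console.log(totalSample)
```

Metrics listed with "-" in the Labels column produce a single series, while labelled metrics produce one series per distinct label combination.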

How to test

  1. Use this config
probes:
  - id: 'mock-1'
    name: 'local-test-test'
    requests:
      - url: http://127.0.0.1:7000
    alerts:
      - assertion: response.status < 200 or response.status > 308
        message: HTTP Status is not 200

  - id: 'http-1'
    name: 'status-200-test'
    requests:
      - url: https://httpbin.org/status/200
        method: GET
    alerts:
      - assertion: response.status < 200 or response.status > 308
        message: HTTP Status is not 200
      - assertion: response.time > 2000
        message: Too slow
notifications:
  - id: '1'
    type: 'desktop'
  2. Run Monika using `npm run start -- --prometheus 9090`
  3. Open `localhost:9090/metrics` in a browser
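
Instead of checking the page by eye, the last step could be automated. The sketch below assumes the endpoint returns the standard Prometheus text format with `# TYPE <name> <type>` lines; it reports which of the metric names from this PR are not declared. The function names `declaredMetrics` and `missingMetrics` are hypothetical, and the sample text (including its HELP wording) is made up for illustration rather than captured from Monika.

```typescript
// Metric names taken from the table in this PR description.
const expectedMetrics: string[] = [
  'monika_alerts_triggered',
  'monika_alerts_triggered_total',
  'monika_probes_running',
  'monika_probes_running_total',
  'monika_probes_status',
  'monika_probes_total',
  'monika_request_response_size_bytes',
  'monika_request_response_time_seconds',
  'monika_request_status_code_info',
]

// Collect the metric names declared via "# TYPE <name> <type>" lines.
function declaredMetrics(expositionText: string): Set<string> {
  const names = new Set<string>()
  for (const line of expositionText.split('\n')) {
    const match = /^# TYPE (\S+) \S+/.exec(line.trim())
    if (match && match[1]) {
      names.add(match[1])
    }
  }
  return names
}

// Return the expected names that the exposition text does not declare.
function missingMetrics(expositionText: string, expected: string[]): string[] {
  const declared = declaredMetrics(expositionText)
  return expected.filter((name) => !declared.has(name))
}

// Made-up sample input, not real Monika output.
const sample = [
  '# HELP monika_probes_total Total number of probes configured',
  '# TYPE monika_probes_total gauge',
  'monika_probes_total 2',
  '# TYPE monika_probes_running gauge',
  'monika_probes_running{id="http-1"} 0',
].join('\n')

const missing = missingMetrics(sample, [
  'monika_probes_total',
  'monika_probes_running',
  'monika_probes_status',
])
console.log(missing)
```

Against a running instance, the text could be obtained in Node 18+ with the built-in `fetch`, for example from `http://localhost:9090/metrics`, and then passed to `missingMetrics` together with `expectedMetrics`.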

@dennypradipta
Contributor Author

I've made some changes to the metrics and updated the docs accordingly.

I tested this by creating a Grafana dashboard to see how it would look:
(four screenshots of the Grafana dashboard, taken 2025-01-03)

You can also try connecting your Prometheus to Grafana yourself. I made a simple repo for that here:
https://github.com/dennypradipta/monika-dashboard

@sapiderman
Contributor

sapiderman commented Jan 20, 2025

There are some metrics without documentation. Can we please add them?

A few examples, among others:

nodejs_heap_space_size_available_bytes
nodejs_version_info
nodejs_gc_duration_seconds

Contributor

@sapiderman sapiderman left a comment

Some comments on the documentation, but LGTM otherwise.

Two review threads on docs/src/pages/guides/cli-options.md (outdated, resolved)
@dennypradipta
Contributor Author

> There are some metrics without documentation. Can we please add them?
>
> A few examples, among others:
>
> nodejs_heap_space_size_available_bytes
> nodejs_version_info
> nodejs_gc_duration_seconds

These are basically the default metrics provided by PromClient. I will add documentation for them.

@haricnugraha
Contributor

I got the following error.
Screenshot 2025-01-24 at 11 45 39 AM

@dennypradipta
Contributor Author

> I got the following error. (Screenshot 2025-01-24 at 11 45 39 AM)

How can I reproduce this error?

@dennypradipta
Contributor Author

> I got the following error. (Screenshot 2025-01-24 at 11 45 39 AM)

Found the solution.

npm ci
npm run build -w packages/notification

Then run Monika again.

Successfully merging this pull request may close these issues.

The metric monika_probe_result displaying incorrectly
Incorrect prometheus monika_probe_running_total metric