Feat: Improve Prometheus Metrics #1338

Merged: 15 commits, Jan 31, 2025
Changes from 3 commits
18 changes: 11 additions & 7 deletions docs/src/pages/guides/cli-options.md
@@ -245,13 +245,17 @@ Then you can scrape the metrics from `http://localhost:3001/metrics`.

Monika exposes [Prometheus default metrics](https://prometheus.io/docs/instrumenting/writing_clientlibs/#standard-and-runtime-collectors), [Node.js specific metrics](https://github.com/siimon/prom-client/tree/master/lib/metrics), and Monika probe metrics below.

| Metric Name | Type | Purpose | Label |
| -------------------------------------- | --------- | -------------------------------------------- | ------------------------------------------- |
| `monika_probes_total` | Gauge | Collect total probe | - |
| `monika_request_status_code_info` | Gauge | Collect HTTP status code | `id`, `name`, `url`, `method` |
| `monika_request_response_time_seconds` | Histogram | Collect duration of probe request in seconds | `id`, `name`, `url`, `method`, `statusCode` |
| `monika_request_response_size_bytes` | Gauge | Collect size of response size in bytes | `id`, `name`, `url`, `method`, `statusCode` |
| `monika_alert_total` | Counter | Collect total alert triggered | `id`, `name`, `url`, `method`, `alertQuery` |
| Metric Name | Type | Purpose | Labels |
| -------------------------------------- | --------- | --------------------------------------------------------- | ----------------------------------------------------- |
| `monika_alerts_triggered` | Counter | Collect count of alerts triggered by a probe | `id`, `name`, `url`, `method`, `alertQuery` |
| `monika_alerts_triggered_total` | Counter | Collect total count of alerts triggered | - |
| `monika_probes_running` | Gauge | Indicates whether a probe is running (1) or idle (0) | `id` |
| `monika_probes_running_total` | Gauge | Collect total count of probes currently running checks | - |
| `monika_probes_status` | Gauge | Indicates the current status of a probe: 0 = DOWN, 1 = UP | `id`, `name`, `url`, `method` |
| `monika_probes_total` | Gauge | Collect total number of probes configured | - |
| `monika_request_response_size_bytes` | Gauge | Collect size of the response in bytes | `id`, `name`, `url`, `method`, `statusCode`, `result` |
| `monika_request_response_time_seconds` | Histogram | Collect duration of probe request in seconds | `id`, `name`, `url`, `method`, `statusCode`, `result` |
| `monika_request_status_code_info` | Gauge | Collect HTTP status code of the probe request | `id`, `name`, `url`, `method` |
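
For orientation, here is a minimal sketch of how one sample of `monika_probes_status` might look when scraped. It renders the standard Prometheus text exposition format by hand rather than through Monika's actual collector; `renderGauge`, `escapeLabelValue`, and the label values are hypothetical names chosen for illustration.

```typescript
// Hypothetical helper, not part of Monika: renders a single gauge sample
// in the Prometheus text exposition format.
type Labels = Record<string, string>

function escapeLabelValue(value: string): string {
  // Backslash, double quote, and newline must be escaped inside label values.
  return value.replace(/\\/g, '\\\\').replace(/"/g, '\\"').replace(/\n/g, '\\n')
}

function renderGauge(name: string, help: string, labels: Labels, value: number): string {
  const labelText = Object.entries(labels)
    .map(([key, val]) => `${key}="${escapeLabelValue(val)}"`)
    .join(',')
  const sample = labelText ? `${name}{${labelText}} ${value}` : `${name} ${value}`

  return [`# HELP ${name} ${help}`, `# TYPE ${name} gauge`, sample].join('\n')
}

// Assumed example probe; 1 means UP, 0 means DOWN, as in the table above.
const probeStatusText = renderGauge(
  'monika_probes_status',
  'Indicates the current status of a probe: 0 = DOWN, 1 = UP',
  { id: 'probe-1', name: 'Example', url: 'https://example.com', method: 'GET' },
  1
)

console.log(probeStatusText)
```

A sample like this can be filtered in PromQL with a selector such as `monika_probes_status{id="probe-1"} == 0` to find probes that are currently down.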

## Repeat

27 changes: 26 additions & 1 deletion src/components/config/get.ts
@@ -22,6 +22,7 @@
* SOFTWARE. *
**********************************************************************************/

import { randomUUID } from 'node:crypto'
import { getContext } from '../../context'
import type { Config } from '../../interfaces/config'
import { log } from '../../utils/pino'
@@ -41,7 +42,10 @@ export async function getRawConfig(): Promise<Config> {
return addDefaultNotifications(config)
}

-  return config
+  // Add default alerts for Probe not Accessible
+  const finalizedConfig = addDefaultAlerts(config)
+
+  return finalizedConfig
}

// mergeConfigs merges configs by overwriting each other
@@ -82,6 +86,27 @@ async function parseNativeConfig(): Promise<Config[]> {
)
}

export const FAILED_REQUEST_ASSERTION = {
assertion: '',
message: 'Probe not accessible',
}

function addDefaultAlerts(config: Config) {
return {
...config,
probes: config.probes.map((probe) => ({
...probe,
alerts: [
...(probe.alerts || []),
{
id: randomUUID(),
...FAILED_REQUEST_ASSERTION,
},
],
})),
}
}

async function parseNonNativeConfig(): Promise<Config | undefined> {
const { flags } = getContext()
const hasNonNativeConfig =
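
To make the effect of `addDefaultAlerts` concrete, the following is a minimal sketch that runs the same mapping on a small sample config. The `Alert`, `Probe`, and `Config` shapes here are simplified assumptions rather than Monika's real interfaces, and the sample probes are invented for illustration.

```typescript
import { randomUUID } from 'node:crypto'

// Simplified, assumed shapes; the real Config and Probe interfaces carry more fields.
interface Alert {
  id: string
  assertion: string
  message: string
}
interface Probe {
  id: string
  alerts?: Alert[]
}
interface Config {
  probes: Probe[]
}

const FAILED_REQUEST_ASSERTION = {
  assertion: '',
  message: 'Probe not accessible',
}

// Same logic as in the diff: append one default alert to every probe
// without mutating the input config.
function addDefaultAlerts(config: Config): Config {
  return {
    ...config,
    probes: config.probes.map((probe) => ({
      ...probe,
      alerts: [
        ...(probe.alerts || []),
        { id: randomUUID(), ...FAILED_REQUEST_ASSERTION },
      ],
    })),
  }
}

// Hypothetical input: one probe with a user-defined alert, one with none.
const input: Config = {
  probes: [
    { id: 'a', alerts: [{ id: 'x', assertion: 'response.status != 200', message: 'Not OK' }] },
    { id: 'b' },
  ],
}

const finalized = addDefaultAlerts(input)

// Probe "a" now has two alerts, probe "b" has one.
console.log(finalized.probes.map((probe) => probe.alerts?.length))
```

Because a new UUID is generated on every call, the default alert's `id` differs each time the config is loaded; code that needs to recognize the default alert would presumably match on its `message` or empty `assertion` instead.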
12 changes: 12 additions & 0 deletions src/components/probe/prober/http/index.ts
@@ -159,6 +159,12 @@ export class HTTPProber extends BaseProber {
response,
})

getEventEmitter().emit(events.probe.status.changed, {
probe: this.probeConfig,
requestIndex,
status: 'up',
})

this.logMessage(
true,
getProbeResultMessage({
@@ -226,6 +232,12 @@ export class HTTPProber extends BaseProber {
}
const alertId = getAlertID(url, validation, probeID)

getEventEmitter().emit(events.probe.status.changed, {
probe: this.probeConfig,
requestIndex,
status: 'down',
})

getEventEmitter().emit(events.probe.alert.triggered, {
probe: this.probeConfig,
requestIndex,
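
The two hunks above emit the same event with `status: 'up'` on success and `status: 'down'` on failure. A minimal sketch of that pattern with Node's built-in `EventEmitter` follows; `reportStatus`, the payload type, and the local `emitter` are hypothetical stand-ins for Monika's `getEventEmitter()` and prober internals.

```typescript
import { EventEmitter } from 'node:events'

// Assumed payload shape, reduced to the fields visible in the diff.
interface StatusChangedPayload {
  probe: { id: string }
  requestIndex: number
  status: 'up' | 'down'
}

const PROBE_STATUS_CHANGED = 'PROBE_STATUS_CHANGED'
const emitter = new EventEmitter()

// Hypothetical helper: reports the outcome of one request as a status event.
function reportStatus(probeId: string, requestIndex: number, succeeded: boolean): void {
  const payload: StatusChangedPayload = {
    probe: { id: probeId },
    requestIndex,
    status: succeeded ? 'up' : 'down',
  }
  emitter.emit(PROBE_STATUS_CHANGED, payload)
}

// Record every payload so the emitted sequence can be inspected.
const received: StatusChangedPayload[] = []
emitter.on(PROBE_STATUS_CHANGED, (payload: StatusChangedPayload) => {
  received.push(payload)
})

reportStatus('probe-1', 0, true)
reportStatus('probe-1', 1, false)

console.log(received.map((payload) => payload.status))
```

`EventEmitter.emit` calls listeners synchronously, so any listener registered before the probe runs sees the status change before the prober moves on to logging or alerting.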
6 changes: 6 additions & 0 deletions src/components/probe/prober/index.ts
@@ -134,6 +134,7 @@ export abstract class BaseProber implements Prober {

// this probe is definitely in incident state because of fail assertion, so send notification, etc.
this.handleFailedProbe(probeResults)

return
}

@@ -148,6 +149,11 @@ export abstract class BaseProber implements Prober {
requestIndex: index,
response: requestResponse,
})
getEventEmitter().emit(events.probe.status.changed, {
probe: this.probeConfig,
requestIndex: index,
status: 'up',
})
logResponseTime(requestResponse.responseTime)

if (
6 changes: 6 additions & 0 deletions src/events/index.ts
@@ -34,6 +34,9 @@ export default {
sanitized: 'CONFIG_SANITIZED',
updated: 'CONFIG_UPDATED',
},
notifications: {
sent: 'NOTIFICATIONS_SENT',
},
probe: {
alert: {
triggered: 'PROBE_ALERT_TRIGGERED',
@@ -46,5 +49,8 @@ export default {
notification: {
willSend: 'PROBE_NOTIFICATION_WILL_SEND',
},
status: {
changed: 'PROBE_STATUS_CHANGED',
},
},
}
2 changes: 2 additions & 0 deletions src/loaders/index.ts
@@ -82,6 +82,7 @@ function initPrometheus(prometheusPort: number) {
decrementProbeRunningTotal,
incrementProbeRunningTotal,
resetProbeRunningTotal,
collectProbeStatus,
} = new PrometheusCollector()

// collect prometheus metrics
@@ -93,6 +94,7 @@ function initPrometheus(prometheusPort: number) {
eventEmitter.on(events.probe.ran, incrementProbeRunningTotal)
eventEmitter.on(events.probe.finished, decrementProbeRunningTotal)
eventEmitter.on(events.config.updated, resetProbeRunningTotal)
eventEmitter.on(events.probe.status.changed, collectProbeStatus)

startPrometheusMetricsServer(prometheusPort)
}
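
The body of `collectProbeStatus` is not part of this diff, so the following is only a sketch of how such a handler could turn the event into a 0/1 gauge value. `StatusCollector` and its in-memory `Map` are hypothetical stand-ins for `PrometheusCollector` and a real metrics client.

```typescript
import { EventEmitter } from 'node:events'

// Assumed payload shape, reduced to what the handler needs.
interface ProbeStatusEvent {
  probe: { id: string }
  status: 'up' | 'down'
}

// Hypothetical stand-in for PrometheusCollector: stores the latest
// gauge value per probe ID in memory.
class StatusCollector {
  readonly gauge = new Map<string, number>()

  // An arrow-function property keeps `this` bound even after the method
  // is destructured from the instance, as initPrometheus does.
  collectProbeStatus = ({ probe, status }: ProbeStatusEvent): void => {
    this.gauge.set(probe.id, status === 'up' ? 1 : 0)
  }
}

const eventEmitter = new EventEmitter()
const collector = new StatusCollector()
const { collectProbeStatus } = collector

eventEmitter.on('PROBE_STATUS_CHANGED', collectProbeStatus)

eventEmitter.emit('PROBE_STATUS_CHANGED', { probe: { id: 'probe-1' }, status: 'up' })
eventEmitter.emit('PROBE_STATUS_CHANGED', { probe: { id: 'probe-2' }, status: 'down' })

console.log(collector.gauge)
```

One design point worth noting: destructuring methods from a class instance only works safely when they do not rely on a dynamically bound `this`, which is why the sketch declares the handler as an arrow-function property.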