Monitoring Graph Thresholds
You might want to monitor your graphs. Since Graphite can output data in JSON format, this is fairly easy to add.
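As a rough illustration of why JSON output makes this easy, the sketch below parses a payload in the shape Graphite's render API returns for `format=json`: a list of series, each with a `target` and a list of `[value, timestamp]` pairs. The sample data and the `series_values` helper are made up for this example and are not part of check_graph.rb.

```ruby
require "json"

# Sample payload (invented values) in the shape Graphite returns for
# format=json: each series has a target and [value, timestamp] pairs.
SAMPLE_RESPONSE = <<~JSON
  [{"target": "Load Average",
    "datapoints": [[0.05, 1300000000], [0.13, 1300000060], [null, 1300000120]]}]
JSON

# Hypothetical helper: maps each target name to its non-null values,
# oldest first. Null datapoints (gaps in the data) are dropped.
def series_values(json_text)
  JSON.parse(json_text).each_with_object({}) do |series, result|
    result[series["target"]] = series["datapoints"].map(&:first).compact
  end
end

VALUES = series_values(SAMPLE_RESPONSE)
puts VALUES.inspect
```

Once the values are plain Ruby arrays, comparing them against thresholds is straightforward.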
NOTE: This feature is new and I am not yet sure how it will work in practice. Feedback is appreciated, and please expect potentially big changes in this area.
Ideally you would Record Monitoring Thresholds right in your graphs, in which case the check script can work out automatically which monitoring to apply. You can also supply thresholds on the command line, which supply or override the ones in the graph.
The graph below shows Load Average for a machine, with Warning and Critical thresholds visible:
title "Load Average"
hide_legend true
field :iowait, :data => "keepLastValue(exmple.munin.load.load)",
:color => "red",
:alias => "Load Average"
critical :value => 0.3
warning :value => 0.1
Running a Nagios check against this graph yields:
$ check_graph.rb --graphite "http://graphite.my.net/render/" --graph monitor.graph
WARNING - Load Average 0.13 >= 0.1
This checks the last 3 data points in the graph, so it only finds the peak at the right of the graph. You can step further back in time with --check:
$ check_graph.rb --graphite "http://graphite.my.net/render/" --graph monitor.graph --check 100
WARNING - Load Average 0.22 >= 0.1
Here we went far enough in the past to hit the peak around 15:00.
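The effect of --check can be sketched roughly as follows. This is not the actual check_graph.rb code: the `check_recent` helper, the sample values, and the assumption that the check compares the peak of the inspected points against each threshold are all illustrative. The exit codes are the standard Nagios plugin ones.

```ruby
# Standard Nagios plugin exit codes.
EXIT_CODES = { "OK" => 0, "WARNING" => 1, "CRITICAL" => 2, "UNKNOWN" => 3 }.freeze

# Hypothetical helper: looks at the newest `check` values and reports the
# most severe threshold reached by the peak among them.
# Assumption: a value breaches a threshold when it is >= that threshold.
def check_recent(name, values, warn: nil, crit: nil, check: 3)
  recent = values.compact.last(check)
  return ["UNKNOWN", "UNKNOWN - no data for #{name}"] if recent.empty?

  peak = recent.max
  if crit && peak >= crit
    ["CRITICAL", "CRITICAL - #{name} #{peak} >= #{crit}"]
  elsif warn && peak >= warn
    ["WARNING", "WARNING - #{name} #{peak} >= #{warn}"]
  else
    ["OK", "OK - All data within expected ranges"]
  end
end

# Invented series, oldest first, with an earlier peak of 0.22.
LOAD = [0.05, 0.22, 0.08, 0.06, 0.09, 0.13].freeze

status, message = check_recent("Load Average", LOAD, warn: 0.1, crit: 0.3)
puts message
puts "exit code: #{EXITCODE = EXIT_CODES[status]}"
```

With the default of 3 points only the most recent values are inspected. Passing a larger `check:` widens the window, which is how an older peak gets picked up.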
If your graph has no thresholds defined, or you simply want to override the ones in the graph, you can do that on the CLI too:
$ check_graph.rb --graphite "http://graphite.my.net/render/" --graph monitor.graph --warn 0.2
OK - All data within expected ranges
These options supply thresholds or override the ones defined in the graph. You can specify --warn and --crit more than once to define bands.
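One way repeated flags could be collected and interpreted is sketched below, using Ruby's standard OptionParser. The band semantics are an assumption made for illustration, since the text above does not spell them out: a single value is treated as an upper limit, and two or more values are treated as a band where anything outside the lowest and highest value is a breach. `parse_thresholds` and `breached?` are hypothetical names, not functions from check_graph.rb.

```ruby
require "optparse"

# Collects repeated --warn / --crit flags into arrays of floats.
def parse_thresholds(argv)
  thresholds = { warn: [], crit: [] }
  OptionParser.new do |opts|
    opts.on("--warn VALUE", Float) { |v| thresholds[:warn] << v }
    opts.on("--crit VALUE", Float) { |v| thresholds[:crit] << v }
  end.parse(argv)
  thresholds
end

# Assumed semantics (not confirmed by the source):
#   no limits  -> never breached
#   one limit  -> breached when value >= limit
#   two or more -> breached when value falls outside [min, max]
def breached?(value, limits)
  case limits.size
  when 0 then false
  when 1 then value >= limits.first
  else value < limits.min || value > limits.max
  end
end

thresholds = parse_thresholds(%w[--warn 0.5 --warn 2.0 --crit 5.0])
puts thresholds.inspect
puts breached?(1.0, thresholds[:warn]) # inside the assumed warning band
```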