The getters for performance checks are a bit special because they have to collect some history before the collected data can be used by the checks (a check that does not find enough history will tell you so and exit with UNKNOWN). In addition, the getter's check_interval must be configured in accordance with the check's --delta. check_interval is a directive of the monitoring daemon's configuration, whereas --delta is a parameter of the check.
E.g. if PerfVolume is configured to check over a 5-minute delta with a tolerance of 3 minutes (--delta=300 --tolerance=180), the performance getter for the volume object must be run every 5 minutes (check_interval 5).
The check's parameters --delta and --tolerance are command-line arguments for the check script and mostly appear as $ARGn$ in a commands.cfg, whereas the interval of the getter is configured with the monitoring system's check_interval directive, typically in a services.cfg. Unless you have changed the interval_length directive from its default value of 60, the number after check_interval is in minutes.
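
The relationship between --delta (seconds) and check_interval (units of interval_length seconds) can be sketched as a small calculation. This is only an illustrative helper, not part of the plugin: the function name check_interval_for_delta is hypothetical, and the sketch assumes that --delta is a whole multiple of interval_length.

```python
def check_interval_for_delta(delta_seconds: int, interval_length: int = 60) -> int:
    """Return the check_interval matching a check's --delta.

    Hypothetical helper for illustration only.
    delta_seconds   -- value passed to the check as --delta (seconds)
    interval_length -- the monitoring daemon's interval_length (default 60)
    """
    if delta_seconds <= 0 or interval_length <= 0:
        raise ValueError("delta_seconds and interval_length must be positive")
    # Assumption of this sketch: only whole multiples are handled.
    if delta_seconds % interval_length != 0:
        raise ValueError("delta is not a whole multiple of interval_length")
    return delta_seconds // interval_length


if __name__ == "__main__":
    # --delta=300 with the default interval_length of 60
    print(check_interval_for_delta(300))
```

With the values from the example above (--delta=300, default interval_length), the helper yields 5, which matches check_interval 5.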
If you are using distributed monitoring (e.g. op5 configured as a peer cluster), you will face the challenge of keeping the store files in sync. This can be done either by keeping them on a common network share or by implementing some rsync logic. We ask you to check our blog for updates on this topic.
These checks require a longer short-term memory (history) reaching into the past so that they can extrapolate historical trends into the future. Whenever you use one of these checks, do not forget to set an appropriate value for the short-term memory in the corresponding getter.
E.g. for --lookbehind=1d in UsageTrend, the volume or aggregate getter needs an equally long short-term memory (--stm=1d or, even better, 25h).
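
Whether a getter's --stm covers a check's --lookbehind can be verified with a few lines of code. The following is a minimal sketch, not part of the plugin: the helper names are hypothetical, and it only understands the two duration suffixes that appear above (h for hours, d for days). Any other format the plugin may accept is not handled here.

```python
import re

# Only the suffixes shown in this document; other units are an open question.
_UNIT_SECONDS = {"h": 3600, "d": 86400}


def parse_duration(text: str) -> int:
    """Convert a duration such as '1d' or '25h' to seconds (hypothetical helper)."""
    match = re.fullmatch(r"(\d+)([hd])", text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    value, unit = match.groups()
    return int(value) * _UNIT_SECONDS[unit]


def stm_covers_lookbehind(stm: str, lookbehind: str) -> bool:
    """Return True if the getter's --stm is at least as long as --lookbehind."""
    return parse_duration(stm) >= parse_duration(lookbehind)


if __name__ == "__main__":
    for stm in ("12h", "1d", "25h"):
        print(stm, stm_covers_lookbehind(stm, "1d"))
```

For --lookbehind=1d, the sketch reports 12h as too short and both 1d and 25h as sufficient.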