Checks the rate of specific events in the Event Management System log.
$ check_netapp_ems event-rate -H <host> [...] [--help]
This plugin reads the event log and counts the number of events within a given lookbehind-period.
A typical usage scenario is counting the number of autosize events within the last few hours. A high rate of such events can indicate that volumes are becoming too small.
--name Name of the EMS events whose rate should be calculated. If omitted, all events are counted. A string prefixed with a tilde (~string) is matched as a regular expression. See the examples below.
--lookbehind Time period for calculating the rate of matching EMS events. Must be a positive integer followed by a time unit: s(econd), min(ute), h(our), d(ay), w(eek). Defaults to 1h.
--rate Rate used for presenting the result in the message and the thresholds. Can be
--critical: Thresholds for the rate. A threshold is written as a plain number without any unit. The threshold's unit is taken from the
--rate=per_second --warning=3 → warns if more than 3 events per second
--rate=per_week --warning=3 → warns if more than 3 events per week
This has changed since v1.1.0 of the plugins! Please check existing configurations from older versions.
For all other parameters, consult --help on the command line.
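The way a plain-number threshold depends on the chosen rate unit can be illustrated with a short Python sketch. It is not the plugin's code: the function name threshold_state and the strict "greater than" comparison are assumptions derived from the wording "warns if more than 3 events".

```python
def threshold_state(rate, warning=None, critical=None):
    """Compare a rate, already expressed in the --rate unit, with thresholds.

    Assumption: a state is raised only when the rate is strictly greater
    than the threshold ("more than 3 events").
    """
    if critical is not None and rate > critical:
        return "CRITICAL"
    if warning is not None and rate > warning:
        return "WARNING"
    return "OK"


# The same event stream, expressed in two different units:
events_per_second = 0.00001
events_per_week = events_per_second * 7 * 24 * 3600  # about 6.05

print(threshold_state(events_per_second, warning=3))  # judged per second
print(threshold_state(events_per_week, warning=3))    # judged per week
```

The same --warning=3 therefore yields different results depending on --rate, which is why existing thresholds should be reviewed whenever the rate unit changes.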
The lookbehind period starts at the latest matching event and extends backwards. All matching events within this period are counted, and the count is divided by the period's length in seconds. This per-second rate is then converted to the unit given by --rate and finally displayed as events per time unit in the check's output.
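The calculation can be sketched in Python as follows. This is a minimal illustration, not the plugin's implementation: the function names, the unit tables, the value per_minute, and the handling of events exactly on the window boundary are assumptions made for the example.

```python
import re

# Seconds per --rate unit. per_second, per_hour, per_day and per_week appear
# in this document; per_minute is assumed from the default "/minute" output.
SECONDS_PER_RATE_UNIT = {
    "per_second": 1,
    "per_minute": 60,
    "per_hour": 3600,
    "per_day": 86400,
    "per_week": 604800,
}

# Suffixes accepted by --lookbehind, mapped to seconds.
LOOKBEHIND_UNITS = {"s": 1, "min": 60, "h": 3600, "d": 86400, "w": 604800}


def parse_lookbehind(value):
    """Turn a string such as '1h' or '30min' into a number of seconds."""
    match = re.fullmatch(r"([1-9]\d*)(s|min|h|d|w)", value)
    if match is None:
        raise ValueError(f"invalid lookbehind period: {value!r}")
    return int(match.group(1)) * LOOKBEHIND_UNITS[match.group(2)]


def event_rate(event_times, lookbehind="1h", rate="per_minute"):
    """Rate of events in the window that ends at the latest event.

    event_times holds the timestamps (in seconds) of the matching events.
    Assumption: an event exactly one period old is already outside the window.
    """
    if not event_times:
        return 0.0
    period = parse_lookbehind(lookbehind)
    latest = max(event_times)
    count = sum(1 for t in event_times if latest - t < period)
    per_second = count / period
    return per_second * SECONDS_PER_RATE_UNIT[rate]
```

For instance, 267 events spread over one hour give 267 / 3600 * 60 = 4.45 events per minute, and the same events reported with rate="per_day" give 6408 events per day.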
$ ./check_netapp_ems event-rate -H sim96
Rate of EMS events during the last hour: 4.45/minute ...
A first, probably not very useful example. It just calculates the number of events (any event!) per minute within the last hour.
$ ./check_netapp_ems event-rate -H filer --name=wafl.vol.autoSize.done
Rate of wafl.vol.autoSize.done EMS events during the last hour: 0.01/minute ...
Monitors the number of wafl.vol.autoSize.done events.
$ ./check_netapp_ems event-rate -H filer --name=wafl.vol.autoSize.done --rate=per_day
Rate of wafl.vol.autoSize.done EMS events during the last hour: 14.40/day ...
Same as above, but displays the rate as the number of autosize events per day.
The calculation is still based on the last hour (the default value for --lookbehind). See the next example for how to change that.
$ ./check_netapp_ems event-rate -H filer --name=wafl.vol.autoSize.done --rate=per_day --lookbehind=1d
Rate of wafl.vol.autoSize.done EMS events during the last 24 hours: 13.82/day ...
Using a regular expression (regex) lets you monitor events that are similar but not identical. For example, to monitor any raid event:
$ check_netapp_ems event-rate -H sim96 --name="~^raid\." --rate=per_hour
Rate of ~^raid\. EMS events during the last hour: 157.27/hour
Using raid.rg.media_scrub will reduce that to counting media-scrub events only:
$ check_netapp_ems event-rate -H sim96 --name="~^raid\.rg\.media_scrub" --rate=per_hour
Rate of ~^raid\.rg\.media_scrub EMS events during the last hour: 127.67/hour
Setting --name to a plain string (no tilde in front) counts only events whose name matches it exactly:
$ check_netapp_ems event-rate -H sim96 --name=raid.rg.media_scrub.done --rate=per_hour
Rate of raid.rg.media_scrub.done EMS events during the last hour: 37.44/hour
The ^ sign in front of the expression anchors it to the beginning of the text and ensures that only events whose name starts with raid are counted. Omitting the ^ would make the regex also match a name like somtext.raider.somthing, which is probably not what you intended.
Also remember to escape regex metacharacters such as the dot, which would otherwise match any character. On the command line in particular, you should also quote the whole regex, as shown in the examples above.
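The matching rules for --name can be sketched in Python as follows. The helper name_matches is hypothetical, and the sketch assumes that a tilde-prefixed pattern is searched anywhere in the event name (which is why the ^ anchor matters); the plugin's regex dialect may differ from Python's in details.

```python
import re


def name_matches(pattern, event_name):
    """Decide whether an event name is selected by a --name value.

    Assumptions for this sketch: no pattern selects every event, a leading
    tilde marks a regex that may match anywhere in the name, and any other
    string must equal the event name exactly.
    """
    if pattern is None:
        return True
    if pattern.startswith("~"):
        return re.search(pattern[1:], event_name) is not None
    return pattern == event_name


# Anchored and escaped: only names beginning with "raid." are selected.
print(name_matches(r"~^raid\.", "raid.rg.media_scrub.done"))
print(name_matches(r"~^raid\.", "somtext.raider.somthing"))

# Without the anchor, "raid" is also found inside "raider".
print(name_matches(r"~raid", "somtext.raider.somthing"))

# Without the tilde, the name must be identical.
print(name_matches("raid.rg.media_scrub", "raid.rg.media_scrub.done"))
```

The second and fourth calls reject the name, while the unanchored third call accepts it, which mirrors the pitfall described above.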