check_netapp_shelfenv (Shelf-Environment)

Checks the shelf-status and various shelf-specific metrics and states on a NetApp-filer.

Description

This plugin checks status and collects metrics of shelves and their environment (temperature, cooling-devices, power-supplies, voltage-sensors, current-sensors).

Most of the checks take no thresholds of their own but rely on the ones set in Data ONTAP. (The temperature check is an exception; see Advanced Examples below.)

The pattern given to the --exclude|-X and --include|-I parameters is matched against the instance's full name (e.g. ‘SBXTEST-01 channel0a shelf0 temp1’). This makes it possible to check specific channels or shelves, but also single elements. See also the section Advanced Examples.
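The matching described above can be illustrated with a short Python sketch. It assumes that the patterns are regular expressions searched anywhere in the full name (unanchored), as the examples in this document suggest. The function name `select_instances` and the order in which include and exclude are applied are assumptions made for illustration, not the plugin's actual implementation.

```python
import re

def select_instances(names, include=None, exclude=None):
    """Return the names that survive include/exclude filtering.

    Assumption: both patterns are unanchored regular expressions matched
    against the full instance name; include is applied first, then exclude.
    """
    selected = []
    for name in names:
        if include is not None and not re.search(include, name):
            continue  # an include pattern was given and this name does not match it
        if exclude is not None and re.search(exclude, name):
            continue  # the name matches the exclude pattern
        selected.append(name)
    return selected

names = [
    "SBXTEST-01 channel0a shelf0 temp1",
    "SBXTEST-01 channel0a shelf0 temp2",
    "SBXTEST-01 channel0a shelf1 temp1",
]

# Keep only the sensors on shelf0, then drop temp2.
print(select_instances(names, include=r"shelf0 ", exclude=r"temp2$"))
```

Because the whole name is matched, a pattern can target a host, a channel, a shelf, a single element, or any combination of them.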

Examples

Simple Examples

$ check_netapp_shelfenv shelf-status
NETAPP SHELFENVIRONMENT OK - 1 shelf checked
SBXTEST-01 channel0a shelf0: normal

Checks all shelves. Returns CRITICAL if the status is not ‘normal’.


$ check_netapp_shelfenv temp
NETAPP SHELFENVIRONMENT OK - 7 temperature-sensors checked
SBXTEST-01 channel0a shelf0 temp1: ok normal_temperature_range(26°C)
SBXTEST-01 channel0a shelf0 temp2: ok normal_temperature_range(34°C)
SBXTEST-01 channel0a shelf0 temp3: ok normal_temperature_range(30°C)
[...]
| SBXTEST-01_channel0a_shelf0_temp1=26°C;;;; SBXTEST-01_channel0a_shelf0_temp2=34°C;;;; SBXTEST-01_channel0a_shelf0_temp3=30°C;;;; [...]

Checks the temperature-sensors in all shelves. Returns CRITICAL if one or more sensors report an error.


$ check_netapp_shelfenv temp --perfdata_uom_string=empty
NETAPP SHELFENVIRONMENT OK - 7 temperature-sensors checked
[...]
| SBXTEST-01_channel0a_shelf0_temp1=26;;;; SBXTEST-01_channel0a_shelf0_temp2=34;;;; SBXTEST-01_channel0a_shelf0_temp3=30;;;; [...]

Same as above, but omits the unit of measurement (UOM) from the perfdata, removing the potentially troublesome ‘°C’ (degree symbol).
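To show why the unit can matter to a consumer of the perfdata, here is a minimal Python sketch that splits each perfdata item into label, numeric value and unit. The regular expression and the function name `parse_perfdata` are illustrative assumptions; the sketch handles only simple, unquoted labels like the ones shown here and ignores the threshold and min/max fields.

```python
import re

# label=value[UOM];warn;crit;min;max  (fields after the value may be empty)
PERF_ITEM = re.compile(r"^(?P<label>[^=]+)=(?P<value>-?\d+(?:\.\d+)?)(?P<uom>[^;]*)")

def parse_perfdata(perfdata):
    """Split a perfdata string into (label, value, uom) tuples."""
    items = []
    for token in perfdata.split():
        match = PERF_ITEM.match(token)
        if match:  # skip anything that is not label=value, e.g. '[...]'
            items.append((match["label"], float(match["value"]), match["uom"]))
    return items

with_uom = "SBXTEST-01_channel0a_shelf0_temp1=26°C;;;; SBXTEST-01_channel0a_shelf0_temp2=34°C;;;;"
without_uom = "SBXTEST-01_channel0a_shelf0_temp1=26;;;; SBXTEST-01_channel0a_shelf0_temp2=34;;;;"

print(parse_perfdata(with_uom))     # unit field is '°C'
print(parse_perfdata(without_uom))  # unit field is empty
```

A parser that accepts only a fixed set of units, or only ASCII, may reject the first form but accept the second.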


$ check_netapp_shelfenv fan
NETAPP SHELFENVIRONMENT OK - 4 fans checked
SBXTEST-01 channel0a shelf0 fan1: ok (2970rpm)
SBXTEST-01 channel0a shelf0 fan2: ok (3000rpm)
SBXTEST-01 channel0a shelf0 fan3: ok (3000rpm)
SBXTEST-01 channel0a shelf0 fan4: ok (3000rpm)
 | SBXTEST-01_channel0a_shelf0_fan1=2970rpm;;;0; SBXTEST-01_channel0a_shelf0_fan2=3000rpm;;;0; [...]

Checks all cooling-fans in all shelves.


$ check_netapp_shelfenv psu
NETAPP SHELFENVIRONMENT OK - 2 power-supplies checked
SBXTEST-01 channel0a shelf0 psu1(type: 9C): ok
SBXTEST-01 channel0a shelf0 psu2(type: 9C): ok

Checks all power-supplies in all shelves.


$ check_netapp_shelfenv voltage
NETAPP SHELFENVIRONMENT OK - 4 voltage-sensors checked
SBXTEST-01 channel0a shelf0 volt1: ok normal_operating_range (5.70V)
SBXTEST-01 channel0a shelf0 volt2: ok normal_operating_range (12.300V)
SBXTEST-01 channel0a shelf0 volt3: ok normal_operating_range (5.70V)
SBXTEST-01 channel0a shelf0 volt4: ok normal_operating_range (12.180V)
| SBXTEST-01_channel0a_shelf0_volt1=5.70V;;;; SBXTEST-01_channel0a_shelf0_volt2=12.300V;;;; [...]

Checks all voltage-sensors in all shelves.

Consider setting --perfdata_uom_string=empty if the ‘V’ (Volts) UOM confuses your monitoring system's graphing engine.


$ check_netapp_shelfenv current
NETAPP SHELFENVIRONMENT OK - 4 current-sensors checked
SBXTEST-01 channel0a shelf0 current1: ok normal_operating_range (4.29A)
SBXTEST-01 channel0a shelf0 current2: ok normal_operating_range (5.58A)
SBXTEST-01 channel0a shelf0 current3: ok normal_operating_range (4.57A)
SBXTEST-01 channel0a shelf0 current4: ok normal_operating_range (0A)
 | SBXTEST-01_channel0a_shelf0_current1=4.29A;;;; SBXTEST-01_channel0a_shelf0_current2=5.58A;;;;  [...]

Checks all current-sensors in all shelves.

Consider setting --perfdata_uom_string=empty if the ‘A’ (Ampere) UOM confuses your monitoring system's graphing engine.


$ check_netapp_shelfenv coin-battery
NETAPP SHELFENVIRONMENT OK - 2 coin-batteries checked
SBXTEST-01.shelf21.1.coin-battery2, status: normal
SBXTEST-01.shelf21.1.coin-battery1, status: normal
 

Checks the status of all coin-batteries in all shelves.


Advanced Examples

Including and Excluding Instances

$ check_netapp_shelfenv cool
NETAPP SHELFENVIRONMENT OK - 4 cooling-elements checked.
TOASTER-01 channel0a shelf0 cool1: ok (2970rpm)
TOASTER-01 channel0a shelf0 cool2: ok (3000rpm)
TOASTER-01 channel0a shelf1 cool1: ok (2940rpm)
TOASTER-01 channel0a shelf1 cool2: ok (3000rpm)

Checks all cooling-elements in all shelves on the cluster.


$ check_netapp_shelfenv cool -X cool2$
NETAPP SHELFENVIRONMENT OK - 2 cooling-elements checked.
TOASTER-01 channel0a shelf0 cool1: ok (2970rpm)
TOASTER-01 channel0a shelf1 cool1: ok (2940rpm)

Excludes any element whose name ends with ‘cool2’ (the trailing ‘$’ anchors the pattern to the end of the name).

$ check_netapp_shelfenv cool -X "shelf0\ cool2$"
NETAPP SHELFENVIRONMENT OK - 3 cooling-elements checked.
TOASTER-01 channel0a shelf0 cool1: ok (2970rpm)
TOASTER-01 channel0a shelf1 cool1: ok (2940rpm)
TOASTER-01 channel0a shelf1 cool2: ok (3000rpm)

Excludes ‘cool2’ on shelf0 only.


$ check_netapp_shelfenv cool -I "shelf1\ cool"
NETAPP SHELFENVIRONMENT OK - 2 cooling-elements checked.
TOASTER-01 channel0a shelf1 cool1: ok (2940rpm)
TOASTER-01 channel0a shelf1 cool2: ok (3000rpm)

Checks only the cooling-elements on shelf1.

Note the backslash-escaped space after the shelf's name; without it, shelf10, shelf11, etc. would also be checked.
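The effect of the trailing space can be reproduced with a small Python sketch. The instance names for shelf10 and shelf11 are hypothetical, and the sketch assumes the pattern is applied as an unanchored regular expression; Python's `re` module is used here only as a stand-in for the plugin's own matching.

```python
import re

names = [
    "TOASTER-01 channel0a shelf1 cool1",
    "TOASTER-01 channel0a shelf10 cool1",  # hypothetical instance
    "TOASTER-01 channel0a shelf11 cool1",  # hypothetical instance
]

def matching(pattern):
    """Return all names in which the pattern is found anywhere."""
    return [name for name in names if re.search(pattern, name)]

# Without the space, 'shelf1' is also a prefix of 'shelf10' and 'shelf11'.
print(matching(r"shelf1"))

# With the escaped space, only the real shelf1 remains.
print(matching(r"shelf1\ cool"))
```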


Thresholds for the temperature

The temperature check accepts thresholds, which are evaluated in addition to the internal status.

$ check_netapp_shelfenv temp --warning=40
NETAPP SHELFENVIRONMENT WARNING - 7 temperature-sensors checked
SBXTEST-01 channel0a shelf0 temp1: ok normal_temperature_range(26°C)
SBXTEST-01 channel0a shelf0 temp2: warning normal_temperature_range(41°C)
SBXTEST-01 channel0a shelf0 temp3: ok normal_temperature_range(30°C)
[...]
| SBXTEST-01_channel0a_shelf0_temp1=26°C;;;; SBXTEST-01_channel0a_shelf0_temp2=41°C;;;; SBXTEST-01_channel0a_shelf0_temp3=30°C;;;; [...]

Checks the temperature-sensors in all shelves. Returns CRITICAL if one or more temperature-sensors report an error, and in addition returns WARNING if one of them is over 40 degrees.


$ check_netapp_shelfenv temp --critical=50
NETAPP SHELFENVIRONMENT CRITICAL - 7 temperature-sensors checked
SBXTEST-01 channel0a shelf0 temp1: critical normal_temperature_range(52°C)
SBXTEST-01 channel0a shelf0 temp2: ok normal_temperature_range(34°C)
SBXTEST-01 channel0a shelf0 temp3: ok normal_temperature_range(30°C)
[...]
| SBXTEST-01_channel0a_shelf0_temp1=52°C;;;; SBXTEST-01_channel0a_shelf0_temp2=34°C;;;; SBXTEST-01_channel0a_shelf0_temp3=30°C;;;; [...]

Checks the temperature-sensors in all shelves. Returns CRITICAL if one or more sensors report an error, or if one of them is over 50 degrees.
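How the thresholds combine with the internal status can be sketched in Python as follows. This is an illustration of the behaviour described in this section, not the plugin's code: the function names are hypothetical, "over" is assumed to mean strictly greater than the threshold, and the worst state among all sensors is assumed to determine the overall result.

```python
OK, WARNING, CRITICAL = 0, 1, 2  # conventional Nagios plugin exit codes

def sensor_state(internal_ok, temperature, warning=None, critical=None):
    """State of one sensor: the internal status first, then the optional thresholds."""
    if not internal_ok:
        return CRITICAL  # the sensor itself reports an error
    if critical is not None and temperature > critical:
        return CRITICAL
    if warning is not None and temperature > warning:
        return WARNING
    return OK

def overall_state(sensors, warning=None, critical=None):
    """Worst state over all (internal_ok, temperature) pairs; OK if there are none."""
    return max(
        (sensor_state(ok, temp, warning, critical) for ok, temp in sensors),
        default=OK,
    )

# Readings taken from the two examples above; all sensors report 'ok' internally.
print(overall_state([(True, 26), (True, 41), (True, 30)], warning=40))   # WARNING (1)
print(overall_state([(True, 52), (True, 34), (True, 30)], critical=50))  # CRITICAL (2)
print(overall_state([(True, 26), (True, 41), (True, 30)]))               # OK (0), no thresholds
```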