Checks the health state of the system and of its individual health aspects. The health check is actively triggered on the target system.
Consider the performance impact on the monitored device if the health check runs too often. You may want to choose a longer interval
between checks and disable retries.
This plugin monitors the health aspects reported by the system's health check.
A typical output would look like:
NETAPP ESERIES HEALTH CRITICAL - 16 health aspects checked, 1 CRITICAL
netapp04.missingVolumes: notCompleted (CRITICAL)
netapp04.integratedHealthCheck: ok
netapp04.dbSubRecordsValidation: ok
netapp04.melEventCheck: ok
netapp04.validPassword: ok
netapp04.failedDrivesPresent: ok
netapp04.exclusiveOperations: ok
netapp04.driveCheck: ok
netapp04.nvsramDisableCfwDownloads: ok
netapp04.hotSparesInUse: ok
netapp04.controllerStatusOptimal: ok
netapp04.volumeGroupsComplete: ok
netapp04.objectGraphSyncCheck: ok
netapp04.configurationDatabaseCheck: ok
netapp04.spmDatabaseVerification: ok
netapp04.storageDeviceAccessible: ok
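The output format shown above (one summary line, then one "name: status" line per aspect, with the alarm level in parentheses for aspects that are not ok) can be read back by a wrapper script. The following is only a minimal sketch in Python: the names Aspect, ASPECT_LINE and parse_health_output are hypothetical and not part of the plugin, and the line format is inferred from the sample output alone.

```python
import re
from typing import NamedTuple


class Aspect(NamedTuple):
    name: str    # e.g. "netapp04.missingVolumes"
    status: str  # e.g. "notCompleted"
    alarm: str   # e.g. "CRITICAL", or "" when no alarm is attached


# Assumed line format: "<name>: <status>", optionally followed by " (<ALARM>)".
ASPECT_LINE = re.compile(
    r"^(?P<name>\S+): (?P<status>\S+)(?: \((?P<alarm>[A-Z]+)\))?$"
)


def parse_health_output(text: str) -> tuple[str, list[Aspect]]:
    """Split plugin output into its summary line and the per-aspect lines."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines:
        return "", []
    aspects = []
    for line in lines[1:]:
        match = ASPECT_LINE.match(line)
        if match:
            aspects.append(
                Aspect(match["name"], match["status"], match["alarm"] or "")
            )
    return lines[0], aspects


sample = """\
NETAPP ESERIES HEALTH CRITICAL - 16 health aspects checked, 1 CRITICAL
netapp04.missingVolumes: notCompleted (CRITICAL)
netapp04.integratedHealthCheck: ok
"""
summary, aspects = parse_health_output(sample)
failing = [a.name for a in aspects if a.alarm]
print(failing)
```

Lines that do not match the assumed format are skipped silently, so the sketch tolerates extra output but would also hide format changes.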
The patterns given to the --exclude|-X and --include|-I parameters restrict the check to specific aspects.
With --ok-status=<regex> you can define which statuses are considered ok (e.g. notCompleted). See also the section Examples.
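To illustrate how include and exclude patterns might narrow down the list of aspects, here is a small Python sketch. It is not the plugin's implementation. It assumes that both patterns are regular expressions searched anywhere in the full aspect name (for example netapp04.driveCheck) and that an exclude match wins over an include match. The function name select_aspects is hypothetical.

```python
import re


def select_aspects(names, include=None, exclude=None):
    """Keep names that match `include` (if given) and do not match `exclude`.

    Assumption: both patterns are unanchored regular expressions applied to
    the full aspect name; the real plugin may match differently.
    """
    selected = []
    for name in names:
        if include and not re.search(include, name):
            continue
        if exclude and re.search(exclude, name):
            continue
        selected.append(name)
    return selected


names = [
    "netapp04.missingVolumes",
    "netapp04.driveCheck",
    "netapp04.hotSparesInUse",
]
only_drives_and_spares = select_aspects(names, include=r"drive|hotSpares")
without_missing = select_aspects(names, exclude=r"missingVolumes")
print(only_drives_and_spares)
print(without_missing)
```

With these three names, both calls keep netapp04.driveCheck and netapp04.hotSparesInUse and drop netapp04.missingVolumes.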
$ ./check_eseries_health --host=netapp04
NETAPP ESERIES HEALTH CRITICAL - 16 health aspects checked, 1 CRITICAL
netapp04.missingVolumes: notCompleted (CRITICAL)
netapp04.integratedHealthCheck: ok
netapp04.dbSubRecordsValidation: ok
netapp04.melEventCheck: ok
netapp04.validPassword: ok
netapp04.failedDrivesPresent: ok
netapp04.exclusiveOperations: ok
netapp04.driveCheck: ok
netapp04.nvsramDisableCfwDownloads: ok
netapp04.hotSparesInUse: ok
netapp04.controllerStatusOptimal: ok
netapp04.volumeGroupsComplete: ok
netapp04.objectGraphSyncCheck: ok
netapp04.configurationDatabaseCheck: ok
netapp04.spmDatabaseVerification: ok
netapp04.storageDeviceAccessible: ok
Checks all aspects on netapp04. Returns CRITICAL if the system or at least one of its sub-systems has a state other than ok.
$ ./check_eseries_health --host=netapp04 --ok-status='^(ok|notCompleted)$'
NETAPP ESERIES HEALTH OK - 16 health aspects checked
netapp04.missingVolumes: notCompleted
netapp04.integratedHealthCheck: ok
netapp04.dbSubRecordsValidation: ok
netapp04.melEventCheck: ok
netapp04.validPassword: ok
netapp04.failedDrivesPresent: ok
netapp04.exclusiveOperations: ok
netapp04.driveCheck: ok
netapp04.nvsramDisableCfwDownloads: ok
netapp04.hotSparesInUse: ok
netapp04.controllerStatusOptimal: ok
netapp04.volumeGroupsComplete: ok
netapp04.objectGraphSyncCheck: ok
netapp04.configurationDatabaseCheck: ok
netapp04.spmDatabaseVerification: ok
netapp04.storageDeviceAccessible: ok
Same as above, but returns OK even if aspects report notCompleted.
$ ./check_eseries_health --host=netapp04 --alarm-limit=WARNING
NETAPP ESERIES HEALTH WARNING - 16 health aspects checked, 1 WARNING
netapp04.missingVolumes: notCompleted (WARNING)
netapp04.integratedHealthCheck: ok
netapp04.dbSubRecordsValidation: ok
netapp04.melEventCheck: ok
netapp04.validPassword: ok
netapp04.failedDrivesPresent: ok
netapp04.exclusiveOperations: ok
netapp04.driveCheck: ok
netapp04.nvsramDisableCfwDownloads: ok
netapp04.hotSparesInUse: ok
netapp04.controllerStatusOptimal: ok
netapp04.volumeGroupsComplete: ok
netapp04.objectGraphSyncCheck: ok
netapp04.configurationDatabaseCheck: ok
netapp04.spmDatabaseVerification: ok
netapp04.storageDeviceAccessible: ok
Returns WARNING instead of CRITICAL if at least one of the health aspects is not ok.
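The three examples above differ only in how a non-ok status is turned into the overall result. The following Python sketch models that decision under stated assumptions: the default ok pattern is taken to be ^ok$ (inferred from the first example), the alarm limit defaults to CRITICAL, and the exit codes are the conventional monitoring plugin codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The function name overall_state is hypothetical and the plugin's real logic may differ.

```python
import re

# Conventional monitoring plugin exit codes.
EXIT_CODES = {"OK": 0, "WARNING": 1, "CRITICAL": 2, "UNKNOWN": 3}


def overall_state(statuses, ok_status=r"^ok$", alarm_limit="CRITICAL"):
    """Return (state, exit_code, failing_names) for a name -> status mapping.

    Any status not matching `ok_status` raises the overall state to
    `alarm_limit`; otherwise the overall state is OK.
    """
    failing = [
        name for name, status in statuses.items()
        if not re.search(ok_status, status)
    ]
    state = alarm_limit if failing else "OK"
    return state, EXIT_CODES[state], failing


statuses = {
    "netapp04.missingVolumes": "notCompleted",
    "netapp04.driveCheck": "ok",
}

default_result = overall_state(statuses)
relaxed_result = overall_state(statuses, ok_status=r"^(ok|notCompleted)$")
limited_result = overall_state(statuses, alarm_limit="WARNING")
print(default_result)
print(relaxed_result)
print(limited_result)
```

The three calls correspond to the default invocation, the --ok-status example, and the --alarm-limit=WARNING example, and yield CRITICAL, OK, and WARNING respectively.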