check_netapp_volume missing (removed volumes)

Checks for (unintentionally) removed volumes.

Usage

$ check_netapp_volume missing -H <host> [...] [--help]

Description

The check_netapp_volume missing plugin notifies you or your team if a volume has been removed from the system.

Background

NetApp has a “recovery queue” which is set to 12 hours by default. If the monitoring reports a deleted volume within that time, there is still a chance to recover the volume.

We recommend renaming a volume before deleting it, e.g. by appending _delete to its name. The optional --exclude=~_delete$ argument then skips such intentional deletions, whereas any other deletion results in an alarm.
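
To get a feeling for which volume names a pattern such as _delete$ would skip, the matching can be approximated with grep. This is only a sketch: it assumes that the text after ~ is treated as a regular expression and applied to the vserver.volume name as printed in the plugin output. The helper filter_excluded is hypothetical and not part of the plugin.

```shell
#!/bin/sh
# Hypothetical helper (not part of the plugin): reads volume names on stdin
# and prints those that match none of the regular expressions given as arguments.
filter_excluded() {
    names=$(cat)
    for pattern in "$@"; do
        # grep -v drops matching lines; "|| true" tolerates an empty result
        names=$(printf '%s\n' "$names" | grep -Ev -- "$pattern" || true)
    done
    printf '%s\n' "$names"
}

# Only the renamed volume is filtered out; the other two names are printed.
printf '%s\n' vserv_a.vol0_delete vserv_a.vol1 vserv_b.vol0 |
    filter_excluded '_delete$'
```

Passing several patterns to the helper mirrors the use of multiple --exclude arguments: a name is dropped as soon as any one pattern matches.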

Important Parameters

  • --include / --exclude: Include or exclude volumes based on their name. For the missing subcommand, exclusions for e.g. clones and temporary volumes will likely make sense. They can easily be combined with the recommended --exclude=~_delete$ pattern from above, as multiple --exclude arguments are allowed.

  • --ignore-after: Sets a time duration (default 12h) after which a missing volume is no longer reported.

For all other parameters, consult --help on the command line.

Examples

Simple Examples

$ check_netapp_volume missing -H filer
NETAPP MISSING VOLUMES UNKNOWN - 0 volumes checked
no history data found, please check --ignore-after parameter and plugin call interval

This only happens during the very first run against a filer, when the check has no earlier values to compare against. This changes from the second run onwards …
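
Conceptually, detecting a missing volume means comparing the volumes seen on an earlier run with those seen now, which is why the first run has nothing to work with. The following is an illustrative sketch only and not the plugin's actual implementation; the function missing_volumes and the snapshot files are assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical helper: prints names listed in the previous snapshot file ($1)
# that no longer appear in the current snapshot file ($2).
missing_volumes() {
    if [ ! -s "$1" ]; then
        # No history yet, as on the very first run: nothing to compare against.
        # 3 is the conventional UNKNOWN exit code of monitoring plugins.
        echo "no history data found" >&2
        return 3
    fi
    prev_sorted=$(mktemp) && cur_sorted=$(mktemp) || return 1
    sort "$1" > "$prev_sorted"
    sort "$2" > "$cur_sorted"
    # comm -23 keeps lines that occur only in the first (sorted) input
    comm -23 "$prev_sorted" "$cur_sorted"
    rm -f "$prev_sorted" "$cur_sorted"
}

# Example: two of three previously seen volumes are gone.
tmpdir=$(mktemp -d)
printf '%s\n' vserv_a.vol0 vserv_a.vol1 vserv_b.vol2 > "$tmpdir/previous"
printf '%s\n' vserv_a.vol0 > "$tmpdir/current"
missing_volumes "$tmpdir/previous" "$tmpdir/current"
rm -rf "$tmpdir"
```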

$ check_netapp_volume missing -H filer
NETAPP MISSING VOLUMES OK - 5 volumes checked
vserv_b.vol2: available
vserv_b.vol1: available
vserv_a.vol1: available
vserv_b.vol0: available
vserv_a.vol0: available

$ check_netapp_volume missing -H filer
NETAPP MISSING VOLUMES CRITICAL - 5 volumes checked, 2 CRITICAL
vserv_b.vol2: missing (CRITICAL), last seen 5 days ago, UUID: 25f65bef-d944-11e9-aa8e-000c29ce3ea1
vserv_a.vol1: missing (CRITICAL), last seen 5 days ago, UUID: 180a95ce-d944-11e9-aa8e-000c29ce3ea1
vserv_a.vol0: available
vserv_b.vol1: available
vserv_b.vol0: available

Two volumes, vol2 on vserv_b and vol1 on vserv_a, have been removed recently.


$ check_netapp_volume missing -H filer --ignore-after=3d
NETAPP MISSING VOLUMES OK - 3 volumes checked
vserv_b.vol0: available
vserv_a.vol0: available
vserv_b.vol1: available

With --ignore-after=3d, the two volumes deleted 5 days ago no longer appear in the check results.
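
The effect of --ignore-after comes down to comparing the time since a volume was last seen with the configured duration. Below is a minimal sketch of that comparison, assuming durations are written as a number followed by h or d as in the examples above. The helpers duration_to_seconds and is_reported are hypothetical, and how the plugin treats the exact boundary is not documented here.

```shell
#!/bin/sh
# Hypothetical helper: converts a duration such as "12h" or "3d" to seconds.
duration_to_seconds() {
    number=${1%[hd]}   # strip the trailing unit
    case $1 in
        *h) echo $((number * 3600)) ;;
        *d) echo $((number * 86400)) ;;
        *)  echo "unsupported duration: $1" >&2; return 1 ;;
    esac
}

# Succeeds if a volume last seen $1 ago would still be reported
# with --ignore-after=$2 (boundary handling is an assumption).
is_reported() {
    age=$(duration_to_seconds "$1") || return 2
    limit=$(duration_to_seconds "$2") || return 2
    [ "$age" -le "$limit" ]
}

# Last seen 5 days ago, limit 3 days: beyond the limit, so no longer reported.
if is_reported 5d 3d; then echo "reported"; else echo "ignored"; fi
# Last seen 5 hours ago, limit 12 hours: still within the limit.
if is_reported 5h 12h; then echo "reported"; else echo "ignored"; fi
```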

Complex Examples

Exclusion of Renamed Volumes

NETAPP MISSING VOLUMES OK - 5 volumes checked
vserv_b.vol2: available
vserv_b.vol1: available
vserv_a.vol1: available
vserv_b.vol0: available
vserv_a.vol0_delete: available

If your naming convention requires volumes to be renamed before deletion, such intended deletions can be skipped using an exclusion:

$ check_netapp_volume missing -H filer --exclude=~_delete$
NETAPP MISSING VOLUMES OK - 4 volumes checked
vserv_a.vol1: available
vserv_b.vol0: available
vserv_b.vol2: available
vserv_b.vol1: available

Other Exclusions

$ check_netapp_volume missing -H sim96 --exclude=~^vserv_a\.
NETAPP MISSING VOLUMES OK - 3 volumes checked
vserv_b.vol0: available
vserv_b.vol2: available
vserv_b.vol1: available

Exclude (do not check) volumes from the vserv_a vfiler.
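
When a pattern contains a backslash, as in ^vserv_a\., an unquoted argument reaches the plugin without it, because a POSIX shell removes the backslash first. The pattern then still matches, but the dot stands for any character instead of a literal dot. Single quotes pass the pattern through unchanged, as this small demonstration with printf (standing in for the plugin call) shows:

```shell
#!/bin/sh
# Unquoted: the shell consumes the backslash before the command sees it,
# so this prints --exclude=~^vserv_a.
printf '%s\n' --exclude=~^vserv_a\.

# Single-quoted: the argument arrives exactly as written,
# so this prints --exclude=~^vserv_a\.
printf '%s\n' '--exclude=~^vserv_a\.'
```

Quoting patterns is therefore the safer habit, in particular for shells that give characters such as ~ or ^ a special meaning.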


$ check_netapp_volume missing -H sim96 --exclude=~^vserv_a\. --exclude=~_delete$
...

Exclusions of intentionally renamed volumes can of course be combined with other exclusions.


$ check_netapp_volume missing -H sim96 --include=~^vserv_a\.
NETAPP MISSING VOLUMES OK - 2 volumes checked
vserv_a.vol1: available
vserv_a.vol0: available

Check only volumes from the vserv_a vfiler.