Release History

7.0.1 Fixed Head Check

Released 2022-11-17

Fixed

  • check_netapp_pro Head -o version can now also process patched versions

7.0.0 Check NetApp Pro gets Check NetApp-ZAPI

Released 2022-02-16

The universal interface introduced in v6.0.0 has been dropped in favor of technically separate products, each with its own user interface and documentation. All of the mentioned technical products are still part of Check NetApp Pro and are covered by its license at no additional cost.

The installation as docker/podman image has been dropped.

New

  • The PerfNic check can now monitor additional counters, for example:

    • link_up_to_downs (number of link state changes from UP to DOWN)
    • tx_ and rx_discards (discarded frames)
    • rx_errs (total number of received error packets)
  • DiskCount accepts the SSD-NVM disk type

Fixed

  • UsageTrend: The “Illegal division by zero” error that occurred when size-total or files-total is 0 or undef is now caught. The affected instance is skipped with a message.
  • check_netapp_scrub.pl: catch undefined values
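
As an illustration of the UsageTrend fix above, the guard can be sketched as follows. This is a minimal Python sketch, not the suite's Perl implementation; the function names, the dictionary keys and the message wording are assumptions made for this example.

```python
from typing import Optional


def usage_percent(used: Optional[float], total: Optional[float]) -> Optional[float]:
    """Return used/total as a percentage, or None if total is 0 or undefined."""
    if not total:  # covers both None (undef) and 0
        return None
    return 100.0 * (used or 0) / total


def check_instances(instances: dict) -> tuple:
    """Split instances into computed percentages and skip messages."""
    results, skipped = {}, []
    for name, values in instances.items():
        pct = usage_percent(values.get("used"), values.get("total"))
        if pct is None:
            # Hypothetical message text; the real check prints its own wording.
            skipped.append(f"{name}: skipped, total is 0 or undefined")
        else:
            results[name] = pct
    return results, skipped
```

Instances without a usable total end up in the skip list instead of aborting the whole check run.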

Removed

  • check_netapp_process.pl is no longer supported and has been removed from the examples cfg due to security concerns.

6.0.0 Universal Getter and Check, Dependency Free Installation

Released 2020-10-06

New

  • A binary universal collector (get_netapp) and check (check_netapp) choose the right binary or script to get data and check results.

    Motivation: Now that 7-Mode, Cluster-Mode ZAPI and Cluster-Mode RESTful filers are supported with their respective APIs, finding the right getter and check had become really confusing. A universal user interface with a single starting point for every case helps new users in particular to set up the configuration.

  • All Perl scripts and modules have been packed into a ready-to-use Docker image, making both installation and upgrades a snap.

  • New PerfNic check to monitor for crc-errors and other counters of the physical network adapters.

  • New switch --consider_latest_only for the SnapCenter-check.

Fixed

  • The script prove_installation.pl returned a false positive result on some installations and has been removed.
  • The check AutosizeMode no longer returns warnings and can now check for the grow mode instead of the non-existent shrink mode.
  • The shelf-environment getter stopped on some hardware. (Message: Can’t call method “children_get” on an undefined value at ./get_netapp_shelfenv.pl). We provide a new switch --skip_empty_channel_lists as a workaround.

Known Issues

  • Some getters still read the port from the authfile. Make sure to remove any port entry from the authfile so that the port can be chosen depending on the protocol (https with port 443 is the default, --no-ssl switches to http with port 80).

  • The volume getter may not return the cluster-node vol0.

5.3.0 New Autogrow-Check (EMS) and aggregate for RESTful API

Released 2020-03-10

New

  • New check_netapp_ems (together with get_netapp_ems) makes it possible to detect an overly high rate of autogrow or other events. Both getter and check work only with DataONTAP 9.6 or later.
  • get_netapp, the new RESTful getter, can now retrieve aggregate data too. This makes it possible to run the Usage check with the aggregate object, as well as other aggregate checks, based on NetApp's new RESTful API.

5.2.1 Fixes for Star-Setup and Missing rpmbuild/SPECS

Released 2020-03-02

This release contains an important fix for the cdot getter get_netapp_cm.pl if used in a star-setup and/or with FlexGroup Volumes.

Fixed

  • New switch --skip_duplicates in get_netapp_cm.pl, which makes it possible to collect data in a star-setup with FlexGroup Volumes. See also this article in the blog for details.
  • Although stated otherwise in the docs, the rpmbuild/SPECS file was missing. This file is available now.

5.2.0 Fixes for RESTful check_netapp and Regex-Matching in SnapCenter Check

Released 2020-02-14

This release contains a few fixes and improvements regarding the new RESTful collector get_netapp plus enhanced filters for check_netapp_snapcenter.

Fixed

  • Add the missing transfiles for the RESTful collector get_netapp
  • Improve the error message shown in case of missing capabilities. (The new API used by get_netapp requires extra capabilities configured in ONTAP. We already had an example-typescript in the manual - the improved error message should help to find it.)

Improved

  • The resource-group and policy filters in check_netapp_snapcenter now understand regex patterns as well as strings. See the article on the blog for details.

5.1.0 New features and options and a new certificate-check

Released 2020-02-01

This release contains a few improvements which did not make it into the 5.0.0 release plus a new experimental check for server-certificates.

Improved

  • The getter for the shelfbay-object has a new switch --ignore_empty_portlist.
  • check_netapp_snapcenter has a new option to send an alarm if specified jobs do not show up at all in the logs: --missing=WARNING|CRITICAL

New

  • Certificate check: checks for expiring server-certificates

5.0.0 RPM-SPECS, new checks and a RESTful binary Collector

Released 2020-01-23

Improved

  • New directory layout making it a bit easier to build packages (RPM). A spec-file for rpmbuild is included as well. Please read the documentation before upgrading!
  • check_netapp_quotas: Lists the usernames of overused quotas. In addition, the new switch --list_users changes the view completely, so that soft- and hard-limits per user are printed in a human-readable manner.
  • The rm_ack feature can now be switched off completely with --rm_ack=off. This goes a step further than --rm_ack=never. See our blog for more details.

New

  • UnprotectedVolume searches for volumes not protected by SnapMirror. Sends an alarm if at least one unprotected volume is found. Using --include=<regex> and --exclude=<regex>, specific volumes and vservers, or groups of them, can be checked.
  • VolumeAge warns about overly old or exceptionally young volumes.
  • SnapCenter check: check_netapp_snapcenter alarms if errors occur during backups. This is the first check we deliver as a precompiled binary, which makes it fast and easy to install.
  • EXPERIMENTAL: A completely new collector, get_netapp, is included to collect data from NetApp systems using their new RESTful API. This collector is also provided as a single, dependency-free precompiled binary. Give it a try if you have a filer with DataONTAP 9.6 or higher at hand. (But consult the docs beforehand!)

Changed

  • The quite old check_netapp_spare has been removed due to security issues (the required ZAPI needs write access). This check can easily be replaced with the more up-to-date DiskCount check.
  • The universal check_netapp_anycli has been deprecated and will be removed with the next version due to security considerations.
  • The collector get_netapp_cm.pl (as well as any new collectors like get_netapp) requires a license file in place. Please consult the chapter License File in the documentation.

4.3.0 NTP-check, updated disk-performance-object, other improvements

Released 2019-06-19

New

  • check_netapp_time: Checks for a valid NTP server configuration on the filer and alarms if the drift between the filer's system time and the monitoring server gets too high.
  • The Usage check gets a new parameter --aggr_over=<percentage> to alarm only if aggregates are both overcommitted and begin to fill up.
  • Overall checks like Usage have a new parameter --top=<n>. Setting --top to a number will reduce the number of instances, e.g. volumes shown in the output to that number. This has been requested to reduce the sometimes very long lists of volumes, disks and the like.

Improved

  • PerfDisk got an additional performance object disk_constituent. Using disk_constituent (instead of disk) avoids getting Multiple_Values returned instead of valid instance names. If this new object is used for the check, the getter's object must be changed as well!

4.2.0 New checks and a flexible way to remove Service Acknowledgements

Released 2019-03-11

New

  • AutosizeMode checks if the auto-size mode of all volumes matches a given preset (grow, grow_shrink).
  • Head has a new object version. This both shows and monitors the DataONTAP version of the nodes. Plus it prints the box-model onto the first line so that it can be quickly searched in the GUI.
  • SisStatus flags volumes whose compression and deduplication are not enabled.
  • Service Acknowledgements can be removed by a custom-script using the new parameter --rm_ack_handler=<handler-script>.
  • check_netapp_unused_lun searches and flags LUNs which are mapped but whose initiators are not logged in.
  • DiskPathQuality checks disk path qualities, reports i/o-error percentages and raises a CRITICAL error whenever an error percentage is above zero.

Improved

  • Completely rewritten example configurations. The etc/*.cfg-files are taken as they are from a running monitoring-system in our lab. So typos or outdated left-overs are almost impossible from now on.

4.1.0 Remove --node from check_netapp_cluster, fix and improved documentation

Released 2018-12-06

Fixed

  • The storefiles written by the getters from the 4.0.x releases were several times bigger than the ones from older versions. We have fixed that now.
  • The input value of --object is now validated in check_netapp_snapshots.
  • VolumeAutosize failed on MetroClusters because of undefined values. These volumes are skipped now.

Changed

  • Remove --node from check_netapp_cluster: check_netapp_cluster has been split up. check_netapp7_cluster covers legacy 7-Mode systems, and the cdot logic for checking the cf-status has been incorporated into check_netapp_takeover.
  • Checks using system-cli have improved documentation.

4.0.2 Fixes

Released 2018-11-07

Fixed

  • Updated the cpanfile and the chapter on Required Perl Modules in the installation document.
  • The parameter --max_records is evaluated (this speeds up getters which pull a large number of instances).

4.0.1 Refurbishment of the whole suite

Released 2018-10-23

Fixed

  • The getters' --stm switch has been re-enabled (in 4.0.0 it was silently ignored, which made UsageTrend fail).

4.0.0 Refurbishment of the whole suite

Released 2018-10-15

As the 3.x versions of check_netapp_pro are now over three years old, it is time for some changes in both the libraries and the user interface.

So we take this major release as an opportunity to remove deprecated arguments and introduce new ones. This will require some changes in the configuration (something we avoid whenever possible in non-major releases)!

Fixed

  • ShelfEnvironment --what=voltage failed (exit 255) in case of a PSU error. Failed PSUs are reported as CRITICAL now.

Changed

  • Completely rewrote the cluster-mode getter to make it more efficient and easier to maintain. Several objects now have their own dedicated getter (disk, snapshot, ...).
  • Introduce the --support switch as a replacement for --mammamia. (This applies to all getters with the exception of 7m and perfdata.)
  • Removed the --vserver|--node|-s parameter from most checks. These checks now have the node- or vserver-name prefixed to every instance-name. So instead of the old --node=my-node one can now filter with --include=^my-node. This also obsoletes the --svm_in_name switch we had implemented in various checks; it has been removed since that behavior is the default now.
  • The call-files are grouped into subfolders per host now. This change will not be visible on sites which start with an empty storedir. (Background: This change reduces the number of files per directory to avoid negative implications on file-system performance.) Sites which want to keep their existing (long-term)-stores must run call2dir.pl once.
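
The switch from --node=my-node to --include=^my-node can be pictured as plain regex filtering on prefixed instance names. The following Python sketch is only an illustration: the dot used as separator between node and instance name, the sample names, and the order in which include and exclude are applied are all assumptions, not taken from the suite.

```python
import re
from typing import Iterable, List, Optional


def filter_instances(names: Iterable[str],
                     include: Optional[str] = None,
                     exclude: Optional[str] = None) -> List[str]:
    """Keep names matching `include` (if given) and not matching `exclude`."""
    inc = re.compile(include) if include else None
    exc = re.compile(exclude) if exclude else None
    kept = []
    for name in names:
        if inc and not inc.search(name):
            continue  # does not match the include pattern
        if exc and exc.search(name):
            continue  # explicitly excluded
        kept.append(name)
    return kept


# Hypothetical instance names with the node name prefixed.
names = ["my-node.vol0", "my-node.data01", "other-node.data01"]
print(filter_instances(names, include=r"^my-node"))
```

Anchoring the pattern with ^ restricts the match to the prefix, which is what replaces the former dedicated node parameter.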

Minor Changes

  • The default of Raidstatus --ok_state also includes mixed_raid_type, hybrid.

New and improved

  • New PerfQtree which can monitor some ops-counters on the q-tree level.
  • Enhanced check_netapp_takeover: New option --ic_check checks the interconnect-links and reports how many of them are up or down.
  • ShelfEnvironment can have thresholds for the temperature-object in addition to the sensor-state check.
  • Snapshots-check: Add additional filter --check_only=with(out)_lun

3.10.3 Fix several direct-check for 7m backwards-compatibility

Released 2018-04-23

Fixed

The following checks are backwards compatible with 7m again:

  • check_netapp_cluster
  • check_netapp_health
  • check_netapp_license
  • check_netapp_scrub
  • check_netapp_spare

In addition, the help text of many other checks has been improved.

3.10.2 Important Bugfix for ServiceProcessor, Update-Mode for Getters, Grafana-Compatibility

Released 2018-03-06

New and improved

  • Remove --svm_in_name switch.
  • Getters are getting smarter with the new update mode (e.g. --update=3min)
  • New --perf_format=grafana: This allows perf-data also for status-checks which is important to visualize them in Grafana.
  • New check_netapp_unused_lun: Checks for LUNs which are online but do not have an initiator connected.
  • PerfSysNode has got a new parameter: --math=average|total.
  • UsageTrend: Additional parameter --reduced_history=OK|WARNING|CRITICAL|UNKNOWN specifies at which level a reduced-history condition is reported. Defaults to WARNING, which is the behavior that was hardcoded so far.
  • New check_netapp_process: Checks for runaway processes on a filer.

3.10.1 Improvements f. UsageTrend, ASUP-check & LunAlignment, DataONTAP 9.3 compatible

Released 2018-02-13

New and improved

  • The checks and getters are compatible with the upcoming DataONTAP 9.3
  • Check LunAlignment has a new switch: --show_misaligned_luns.
  • Check UsageTrend supports two new filter- and display switches already known from other checks: --check_only=without_lun|with_lun|... and --svm_in_name.
  • The ASUP-log-check will now consider retransmitted messages and has got a --lookbehind filter to ignore messages which are too old.
  • UsageTrend handles missing data in a store more flexibly (tries to calculate with reduced look-behind).
  • DiskPaths2 is a planned replacement for the deprecated DiskPaths check. DiskPaths2 has a different UI and is in an early alpha state. Do not rely blindly on the config-sets delivered with this check!

Fixed

  • Check Job correctly shows the number of queued jobs in its perf-data. Also enhanced the --help and examples.

3.10.0 PerfSysNode, PerfAggregate, LunAlignment, ASUP-Check

Released 2017-12-07

New and improved

  • New check PerfSysNode (amends PerfSys for clusters with DataONTAP 8.3+)
  • New check LunAlignment
  • New check PerfAggregate
  • New check check_netapp_asup.pl monitors the autosupport-log for failed transmissions.
  • Much better documentation for the growth metric of OvercommitAggr
  • Add option --check_only=with_lun|without_lun to SnapshotLessVolume
  • Enhance UsageTrend: --what=inodes interpolates the usage of inodes and warns if the number of available inodes will be reached within a given time.
  • Enhance check_netapp_takeover: checks the metro-cluster configuration
  • Snapshots --older_than and --younger_than also accept seconds, minutes and hours (plus days and weeks as in previous versions).

Fixed

  • Better error handling if --storedir points to a non-writable directory
  • Sis: Handling of volumes where an error occurs while fetching the status values
  • Head: Fixed several more or less cosmetic issues when one node is down
  • Disk: Fixed --what=non-zeroed-spare
  • UsageTrend: Fixed handling of offline or otherwise not available volumes.

3.9.2 ServiceProcessor Check

Released 2017-09-13

Fixed

  • Usage: --check_only and --svm_in_name are compatible now
  • check_netapp7_vfiler handles --check_network even if vfnet is empty
  • check_netapp_quotas: Option --path filters by qtree-path now
  • Distribution: takeover- and quota-check are now distributed as part of the ADVANCED Bundle
  • Documentation: missing api aggr-scrub-list-info in list_apis.pl --mode=7m

3.9.1 Fixes + new Takeover and Job-Check

Released 2017-08-30

Fixed

  • Improved the error message in case of too little data in perf-stores.
  • Fixed the error-message in case a filer does not return a counter-value.
  • Removed the example from Snapshots which showed how to use Snapshots to search for volumes without snapshots, as this did not work as expected. Using Snapshots to check for snapshot-less volumes will result in false negatives! Existing customers should check their configuration and replace these service-checks with the new SnapshotLessVolume check.
  • Fixed a bug which stopped UsageTrend from working if some instances had been created within the look-behind period

New and improved

  • New: check_netapp_takeover - checks if the take-over facility is enabled and takeovers are possible
  • New check: SnapshotLessVolume - searches for volumes without a snapshot.
  • New Check Job - checks for failed jobs.
  • New switch --svm_in_name for the Usage-check prefixes the volume's name with the vserver. This avoids confusing duplicates in the output.

3.9.0 Faster store-format, LunSize Check, --ignore_case Switch

Released 2017-07-11

Important

Due to a change in the store-format, existing store-files must be converted (or deleted) before upgrading! Please read the installation instructions.

Fixed

  • --mammamia for 7m-getter did not write into storedir
  • provide --check_only=without_lun_7m for 7m-filers
  • check_netapp_anycli: Document switches --like_result and --unlike_result
  • get_netapp_shelfenvironment: Do not stop if one node is down
  • check_netapp7_vfiler.pl: Handle undefined vfiler-status
  • SnapMirrorState: Avoid UNKNOWN caused by no longer existing snap-mirrors
  • SnapMirrorState/Metric: Excluding all relations and setting --no_instances=OK is ok now

New and improved

  • New check: LunSize
  • Faster, leaner and more reliable store-format (existing store-files must be converted before upgrading! Please read the installation instructions.)
  • New switch: --ignore_case (ignore case for --exclude and --include patterns)

3.8.1 Quota check and workarounds

Released 2017-05-04

New and improved

  • New check: check_netapp_quotas.pl - monitors quotas on a cdot filer.
  • StorageUtilization output units can be controlled by --factor=ki|Mi|Gi|...
  • DiskPaths: Bugfix and additional switch --port_pattern_ok=AAAA | BBBB | ...
  • ShelfEnvironment: False positives regarding the shelf-status solved with switch --workaround_for_279931
  • ShelfEnvironment: False positive for voltage-sensors in DataONTAP 9.1P1 (solved with temporary switch --DataONTAP_91P1)

3.8.0 FCPAdapter-Check, Usage --check_only=<range>, PerfVolume --math=average|total...

Released 2017-03-24

New and improved

  • New check: FCPAdapter checks fcp-adapters status (online, ...)
  • Usage-check: --check_only=500GiB..1TiB (ranges to specify different thresholds depending on the volumes total size)
  • New switch in PerfVolume: --math=average|total
  • New metric in Sis: changelog-used-percent
  • New switch --ignore=^temp__ in perfdata-getter - this is the default!
  • New switch --node|--vserver for VolumeAutosize check
  • New read_ops and write_ops counters for PerfDisk check.
  • New switch --exclude=<counter> in perfdata-getter. Required to exclude the newly added _ops-counters on older filers.
  • Specific cdot-getter for the ShelfEnvironment check; this new getter does not need the --node parameter any more.
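
To illustrate the size ranges accepted by --check_only=500GiB..1TiB in the Usage check, here is a hedged Python sketch of how such a range could be parsed and matched against a volume's total size. The function names, the supported unit list and the inclusive bounds are assumptions for this example; the release note does not specify them.

```python
import re

# Binary units; which units the real check accepts is an assumption here.
UNITS = {"KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40}


def parse_size(text: str) -> int:
    """Convert a string such as '500GiB' into a number of bytes."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)(KiB|MiB|GiB|TiB)", text)
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    return int(float(match.group(1)) * UNITS[match.group(2)])


def in_range(total_bytes: int, spec: str) -> bool:
    """Return True if total_bytes lies within 'LOW..HIGH' (bounds inclusive)."""
    low, high = spec.split("..")
    return parse_size(low) <= total_bytes <= parse_size(high)
```

A service definition per size class could then apply its own thresholds only to the volumes for which in_range returns True.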

Fixed

  • ClusterPeerHealth: Returns "UNKNOWN - No matching instances found" instead of "no data" if no peer has been found

3.7.1 USABILITY IMPROVEMENTS and FIXES

Released 2016-12-16

New and improved

  • Performance and History-Checks: Improved usability by setting a generally higher --tolerance and printing detailed instructions in case of a misconfigured getter.

    • The default for --tolerance is now always 50% of the --delta
    • The checks analyze the existing store-data and print any applicable solution to stdout.

Fixed

  • Snapshots: --show_problems works now as documented
  • Sis: handles sis-disabled volumes (skips them if --check_disabled is not set)
  • PerfDisk: tolerance was missing

3.7.0 MULTI-STORE, MULTI-TENANCY CHECKS, SIS CHECK

Released 2016-11-08

New and improved

  • All getters can write to more than one storedir. This makes it possible to check from more than one peer/poller. ATTENTION 1: You need to either start with an empty storedir or migrate existing stores. (See also section Upgrading to Version 3.6.x and 3.7.x in the docs.) ATTENTION 2: Storedirs are no longer created automatically. Add --autocreate_storedir to the getters to get back to the old behavior.
  • Direct checks with overall-logic (e.g. check_netapp7_fcpstats, check_netapp_scrub) can write their instance_results into more than one storedir.
  • New Multi-Tenancy Checks: ReportSpace, ReportIOPS
  • New Check Sis - check for dedup-problems (stale-fingerprints, run-time)
  • New check check_netapp_scrub to alarm if the last scrub's timestamp is older than a certain age.
  • Disk check has thresholds now - get a WARNING for 1 failed disk and a CRITICAL alarm for 2 or more.
  • --max_records can be configured to reduce the time to collect data (volume, vol_snapshot, disk) from cdot-filers
  • Snapshots can check the size of single snapshots now (--single).
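
The Disk check thresholds mentioned above (WARNING for 1 failed disk, CRITICAL for 2 or more) follow the usual threshold pattern. A minimal Python sketch, in which the function name and the idea that both thresholds are adjustable parameters are assumptions:

```python
def disk_state(failed_disks: int, warning: int = 1, critical: int = 2) -> str:
    """Map a number of failed disks to a monitoring state."""
    if failed_disks >= critical:
        return "CRITICAL"
    if failed_disks >= warning:
        return "WARNING"
    return "OK"
```

The critical threshold is tested first so that a count satisfying both thresholds reports the more severe state.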

3.6.0 NEW STORE-FORMAT

Released 2016-09-03

New and improved

  • A new format and logic for some store-files dramatically reduces the memory-footprint of checks using performance- and history-stores.
  • New check_netapp_license.pl
  • New switch --vserver for LunState
  • New switch --prefix for PerfLif
  • New switch --uninitialized_relation_is_ok for SnapMirrorMetrics (useful if you set up a far-away vserver peer and activated protection with snapmirror; synchronization can take a long time then)
  • New switch --exclude_dependency=vclone for the Snapshot check to filter on application-dependency
  • Hint in output if instances have been excluded
  • Snapshots-check: --node|--vserver is optional now

Fixed

  • ShelfEnvironment fixed to check 7m MetroCluster

3.5.2 REACTIVATE max_length_multi_line

Released 2016-07-06

New and improved

  • Length of output can be limited by setting max_length_multi_line in check_netapp_pro.pl

3.5.1 --average IMPLEMENTED into PERFDISK

Released 2016-06-12

New and improved

  • --average implemented into PerfDisk

3.5.0 CLUSTER-PEER-HEALTH, NO INSTANCES CAN BE OK NOW!

Released 2016-06-08

New and improved

  • New check ClusterPeerHealth
  • New check MetroClusterVserver
  • New option for the collector-checks: --no_store=OK|... (most useful together with the collector's option --no_instances=OK|...)
  • DiskPaths: unassigned disks are skipped
  • VolumeState: Node-filter --exclude_nodes/--include_nodes introduced

3.4.2 FIXES USAGE/SHELF-ENVIRONMENT, DISKCOUNT, WORKLOAD, LUNSIZE

Released 2016-04-21

Fixed

  • Usage-Check: Handling of volumes with no state-attribute
  • shelf-environment-object in cm-getter needs a node-name (sometimes)
  • snap-mirror getter for 7-mode: Handle empty config (no snap-mirrors)
  • Respect the options --timeout and --hide_uuid

New and improved

  • new DiskCount check (counts spare-disks per type or storage-pool)
  • new workload-object in the perf-getter (in preparation for a cdot total-ops check)
  • extended lun-object in the 7m/cm-getters (in preparation for a LunSize check)

3.4.1 SWITCH --VSERVER|--NODE OPTIONAL FOR USAGE-CHECK, FIXES, DOCUMENTATION

Released 2016-03-07

Fixed

  • Usage check: unknown volumes are skipped now
  • Snapshots: target type not properly shown in case of singular

New and improved

  • Switch --vserver|--node is now optional for the Usage check
  • Documentation and examples

3.4.0 SHELF-CHECKS, UNZEROED-SPAREDISKS, UNBALANCED DISKS, METROCLUSTER-CHECKS

Released 2016-02-25

Fixed

  • Vserver checks operational-state (instead of admin-state) if DataONTAP >= 8.3

New and improved

  • New: ShelfEnvironment checks a shelf's temperature, power-supplies and sensors
  • New: Check for non-zeroed spare disks with the Disk check
  • New check BadlyPerformingDisks for detecting unbalanced disk usage.
  • New check_netapp_mc_config.pl for checking a MetroCluster's mode and configuration.
  • New check_netapp_anycli.pl for building checks with simple CLI-commands.
  • New option --alarm_limit (signals critical conditions as warning)
  • check_netapp7_snapvault.pl can check the lag-time
  • New option --display_factor for check_netapp7_snapvault.pl
  • New option --skip_reason in Vserver check
  • New option --avg in PerfCpu (for getting the averaged usage)

3.3.1 FABRIC METRO CLUSTER SUPPORTED, USER-FRIENDLY TIMES, FIXES

Released 2015-12-01

New and improved

  • DiskPaths can check Fabric Metro Clusters with 8 disk-paths
  • User-friendly time-values in addition to seconds: h, min, d(ay), (w)eek. E.g. -w 3 -c 72 --factor=h instead of -w 10800 -c 259200
  • New Option: --perfdata_uom_string to replace correct but not accepted units like '/s' with whatever you want. E.g. --perfdata_uom_string=s|persec|empty|°C|µs|us|...
  • IfGrp lists ports which are down
  • NetPort/NetPort7m: new switch --ignore_disabled
  • Getter for shelf-bay objects can retrieve data from a --single_system
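
The effect of --factor on time thresholds can be checked with a little arithmetic. The Python sketch below is an illustration only: the lookup table, the function name and the exact spellings accepted for seconds, days and weeks are assumptions; just h and min are quoted verbatim in the release note.

```python
# Seconds per unit; "s", "d" and "w" are assumed spellings.
FACTOR_SECONDS = {"s": 1, "min": 60, "h": 3600, "d": 86400, "w": 604800}


def to_seconds(value: float, factor: str = "s") -> float:
    """Convert a threshold given in `factor` units into seconds."""
    return value * FACTOR_SECONDS[factor]


# -w 3 -c 72 --factor=h corresponds to -w 10800 -c 259200
print(to_seconds(3, "h"), to_seconds(72, "h"))
```

This reproduces the example from the release note: 3 hours are 10800 seconds and 72 hours are 259200 seconds.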

Documentation

  • Handling of Error "Magic number checking on storable file failed"

Fixes

  • NetPort handles undefined link-status (sends an alarm)
  • Usage of check_netapp7_vfiler.pl mentioned wrong switch

3.3.0 MORE LEGACY CHECKS, DISK INVENTORY INFO, CHECK for FCP CRC-ERRORS

Released 2015-11-04

New and improved

  • SnapMirror has a getter for 7-mode
  • New Check for 7-mode: check_netapp7_snapvault
  • New Check for 7-mode: check_netapp7_vfiler
  • New Check: check_netapp7_fcpstats: Checks for fcp crc-errors (7mode only)
  • Disk shows inventory info (so actions can be taken immediately)
  • New Switch to hide the lengthy UUID in output
  • New: Snapshots can now filter on the snap-mirror-label
  • Additional counters (evicts, disk_reads_replaced) in the FlashCash-Check
  • Preparation for DiskPath: path_use_state gets collected, required for checking Fabric MetroClusters (FMCs)

Fixes

  • interface-getter/IfGrp check: handle interfaces with undefined mediatype or empty link-states
  • IfGrp check can handle vifs-in-vifs recursively
  • The labels in the perf-data are always unique now

Documentation

  • New chapter: How to configure a cluster-mode read-only user for monitoring
  • Several improvements, mainly in the Trouble-Shooting section.

3.2.0 NEW CHECKS and SWITCHES, FIX f. DUPLICATE INSTANCES BUG, MICRODELTAS

Released 2015-08-27

New and improved

  • new check: PerfTcpIp
  • new check: NetInterface
  • new check: PerfHostadapter
  • new check: UsageTrend
  • new check: SyncMirror for cdot Metro Cluster
  • new switch: --no_instances=OK...
  • new switch for volume Usage: --check_only=with_lun|without_lun
  • alarm (UNKNOWN) in case of counter overflows
  • microdeltas (experimental feature)
  • check_netapp_spare.pl: Better (more human understandable) help and output

Fixes

  • duplicate perf-instances in cmode if the number of instances was > 150

3.1.1 LEGACY CHECKS, CHECK FOR BROKEN AND UNASSIGNED DISKS

Released 2015-03-25

Fixes

  • IMPORTANT: broken disks are no longer skipped in cluster mode
  • disk getter handles unassigned disks

New and improved

  • Legacy Checks ported from 2.x to 3.x: check_netapp7_head.pl, check_netapp_health.pl
  • Filer can be checked for broken or unassigned disks

3.1.0 STABLE RELEASE

Released 2015-03-13

Fixes

  • some small fixes in the cfg-files
  • removed confusing default-thresholds in PerfVolume

New and improved

  • New switch --show_problems appends all non-ok instances to the output's first line. Ideal for sending more comprehensive alarm messages to mobile phones. (More details are in the blog)

3.0.18 RELEASE CANDIDATE II

Released 2015-02-25

Fixes

  • StorageUtilization returns 25 out of bounds
  • snapshot-getter can handle offline/restricted/busy volumes

New and improved

  • --no_relations_is_ok switch for SnapMirror checks
  • DiskPaths can check up to 4 paths
  • Documentation improvements

3.0.17 RELEASE CANDIDATE I

Released 2015-01-26

New and improved

  • VolumeAutosize for cm and DataONTAP 8.2.1
  • PerfLif for DataONTAP 8.2.1 or higher
  • New check Disk (replaces the deprecated DiskFailed)
  • New option-argument to switch off performance-data: --perf_data=suppress

Fixes

  • Lots of fixes and enhancements based on feedback from internal UA-tests
  • Lots of fixes and enhancements based on customer-feedback (thanks to all of you!)
  • Reduced ONTAPI requirement for all 7m checks (1.11 instead of 1.15)

Documentation

  • Much more complete configuration files (Nagios cfg) for 7m and cm
  • Getter tells the user the correct getter-objects if an unknown object is detected
  • Getter: Pseudo-object 'explore' lists all available objects (-o explore).

3.0.16 NETPORT7M, FIX in SHELFBAY, ENHANCED CLUSTER-HA-CHECK

Released 2014-10-15

  • Fixed: ShelfBay getter died with non-installed disks
  • New: NetPort7m checks interface-status on 7-mode filers
  • Enhanced: check_netapp_cluster checks 'is-enabled' and 'is-interconnect-up'
  • Updated: Installation Documentation

3.0.15 PERFORMANCE IMPROVEMENTS (VOLUME, VOLUME-SNAPSHOTS)

Released 2014-10-06

  • Improved: Reduced amount of data retrieved from filer (volume, vol_snapshot in cluster-mode)
  • Debug-output of perf-getter extended to show timestamps for both the instance-name and the counter-value retrieval period.

3.0.14 LUNSTATE, STORAGEUTILIZATION

Released 2014-10-02

  • New check: LunState check for offline-LUNs
  • Now available in pro: StorageUtilization
  • Fix for a problem on systems with a patched Carp-module causing the getter to stop with "Do you need to predeclare croak?"

3.0.13 VSERVER, DiskFailed, SnapVault

Released 2014-09-16

  • Renamed: Vfiler has been renamed to Vserver.
  • Enhanced: DiskFailed can now check for offline disks on 7m filers
  • SnapVault included into SnapMirror, relationship-type visible in --explore

3.0.12 VFILER, WAFL

Released 2014-08-21

  • New check: Vfiler monitors the state of data-vfilers.
  • New check: Wafl monitors WAFL-counters like consistency points per second.

3.0.11 SEVERAL FIXES AND DOCUMENTATION IMPROVEMENTS

Released 2014-08-14

  • Fixed NVRAM
  • Relaxed ONTAPI Requirement for 7-Mode
  • utf8 conversion of all libraries

3.0.10 OVERCOMMIT-AGGR, VOLUME-AUTOSIZE

Released 2014-07-14

  • New check: VolumeAutosize alarms if autosized volumes grow too fast
  • Improvement: --explore/--discover
  • Fix: Removed dependency on antlers
  • Feature: OvercommitAggr provides additional growth metric

3.0.9 SNAPMIRROR

Released 2014-06-24

New Checks in this version

  • SnapMirrorMetrics - checks SnapMirror's lag-time, transfer-size, ...
  • SnapMirrorState - checks SnapMirror's health and mirror-state.

New Feature

  • Snapshots with the new switch --comparison=lt finds volumes with no snapshots

3.0.8 MORE PERF-CHECKS

Released 2014-06-03

New Checks in this version

  • BufferCache - checks several metrics of the system buffer cache (=system memory)
  • FlashCash - checks the PAM2 card's performance
  • LunLatency - checks a LUN's latency or operations/second

3.0.7 REDUCED DEPENDENCIES, PERFSYS, MAMMAMIA

Released 2014-06-02

Fixed

  • IfGrp was broken for cluster-mode. Added --node switch to allow selective checking of a specific node.

New Checks

  • PerfSys - checks system-wide perf-counters (transfer-rates, ops, ...)

Updated

  • Removed dependency on non-core module Const::Fast antlers (installation should be possible without CPAN now)
  • Performance improved by using faster accessors
  • PerfIf and PerfCpu got a --node switch, to selectively check a specific node.

New Feature

  • --mammamia - setting this single switch writes several pieces of debug information into one directory that is easy to compress and transfer. Now supported for *all* getter-scripts!

3.0.6 PERFVOLUME's --vserver, --mammamia

Released 2014-05-19

Fixed

  • PerfVolume -H <myfiler> --vserver=vserv01 checks only volumes of vserv01.

New Feature

  • --mammamia - setting this single switch writes several pieces of debug information into one directory that is easy to transfer. So far only supported for check_netapp_perfdata.pl. (Had erroneously been announced for 3.0.5 already)

3.0.5 VOLUME-PERFGETTER USES VSERVER NOT NODES

Released 2014-05-19

Fixed

  • check_netapp_perfdata.pl --volume uses vserver not node as container for the volumes.

3.0.4 SIMPLIFICATIONS AND NEW PERFCHECKS, DISKFAIL-CHECK, FIXES

Released 2014-05-15

Simplified

  • Performance-getter and performance-checks always have a 1:1 relation now. (E.g. PerfVolume requires just one getter and not two.)
  • The perf-getters' --mode switch can be written as an attribute into netapp_credentials (auth-file). See the example in etc.

Fixed

  • Usage for volumes, VolumeState handle down-nodes
  • ShelfBay --state is mandatory now
  • ShelfBay --state=disk false-negatives eliminated (may be too strict now - please report)

New Checks

  • PerfCpu - checks CPU performance.
  • NVRAM - checks the NVRAM's performance.
  • PerfIf - checks the performance of network interfaces.
  • DiskFailed - checks for failed disks.

3.0.3 SHELF-BAY/HEAD/UPTIME FIXED, NEW NET-PORT CHECK

Released 2014-05-07

Fixed

  • ShelfBay check found duplicate bays on FAS 2240
  • Head and Uptime handle rebooting nodes better, additional switch --vserver

New Checks

  • NetPort - checks net-ports link-status on cm-filers. Replaces IfConfig.

3.0.2_02 IFGRP for CLUSTER MODE (EXPERIMENTAL)

Released 2014-04-30

New Checks

  • IfGrp - supports cluster-mode. Use -o net-port for the getter. (experimental - please report!)

3.0.2_01 EXPERIMENTAL BUGFIX + NEW PLUGINS

Released 2014-04-29

Fixed

  • Getter for shelf-bay fixed (experimental - please report!)

New Checks

  • VolumeState - checks for non-online volumes (configurable)
  • AggregateState - checks for non-online aggregates (configurable)
  • Raidstatus - checks the raid-status of aggregates

3.0.2 FIRST BETA - BUGFIXES

Released 2014-04-23

Fixed

  • debugging printed w/o verbose
  • usage in Snapshots
  • message in Heads

3.0.1_14 FOURTEENTH ALPHA - Heads

Released 2014-04-11

Implemented

  • Uptime - reappeared for cm. Checks the time since last reboot.
  • check_netapp_cluster - direct check (w/o collector) - checks the HA-status

3.0.1_13 THIRTEENTH ALPHA - Heads

Released 2014-04-07

Implemented

  • Heads - reappeared for cm. Checks hardware and global health-state.

Fixes

  • Several bugs and UI inconsistencies repaired.

3.0.1_12 TWELFTH ALPHA - PerfDisk

Released 2014-04-02

Implemented

  • NEW: PerfDisk - checks disk_busy counter on 7m and cm.

3.0.1_10 TENTH ALPHA - several additional checks

Released 2014-03-24

Implemented

  • NEW: SyncMirror - checks the mirror-status on Metro Cluster aggregates
  • NEW: IfGrp - checks if an interface-group has enough active links (replaces IfConfig)
  • NEW: DiskPaths - checks if each disk has two paths (A/B, B/A)

3.0.1_06 SIXTH ALPHA - ShelfBay (Shelf- and Disk Port Status Monitoring)

Released 2014-03-11

Implemented

  • NEW: ShelfBay - checks the status of shelves and the disk-port status (e.g. BYP, disk bypass).
  • Enhanced help (--help) for all checks (examples).

3.0.1_05 FIFTH ALPHA - PerfVolume (Per Volume Latency) gets PRO

Released 2014-03-07

Implemented

  • NEW: PerfVolume - checks as many volumes as you want - no more 'Not enough memory to get instances'.
  • NEW: PerfVolume has additional counters for CIFS and FCP and checks both 7-mode and cluster-mode filers.
  • NEW: get_netapp_perfdata.pl lists all available counters from your system (try it out: get_netapp_perfdata.pl -H filer -o volume ... --explain=counters).

3.0.1_03 + 3.0.1_04 THIRD and FOURTH ALPHA of combined PRO-Release (7M/CM)

Released 2014-02-28

Implemented

  • FIX: Authentication via credentials file fixed
  • NEW: check_netapp_spare - checks for spare-low-conditions (not enough spare-disks) on nodes and clusters (both 7m and cm)

3.0.1_02 SECOND ALPHA of combined PRO-Release (7M/CM)

Released 2014-02-02

Implemented

  • New architecture which supports both DataONTAP versions (7m, cluster-mode).
  • Aggregate snapshots in 7-mode

The following checks should work for both 7m and cm:

  • Usage (volumes, aggregates)
  • Snapshots (volume- and aggregate-snapshots for 7m, volume-snapshots for cm)

Missing

  • A check for shelves which, together with a Head check, will replace the old Hardware check.
  • Head, Performance Checks

3.0.1_01 FIRST ALPHA of combined PRO-Release (7M/CM)

Used only internally.