Release History
7.0.1 Fixed Head Check
Released 2022-11-17
Fixed
- check_netapp_pro Head -o version can now also process patched versions
7.0.0 Check NetApp Pro gets Check NetApp-ZAPI
Released 2022-02-16
The universal interface introduced in v6.0.0 has been dropped in favor of technically separated products where each of the products has its own user interface and documentation. All of the mentioned technical products are still part of Check NetAppPRO and are covered by its license at no additional cost.
The installation as docker/podman image has been dropped.
Changed
New
Fixed
- UsageTrend: The “Illegal division by zero” error that occurred when size-total or files-total is 0 or undefined is now caught. The instance is skipped with a message.
- check_netapp_scrub.pl: catch undefined values
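The UsageTrend fix above amounts to guarding the percentage calculation against a missing or zero total. A minimal Python sketch of that idea follows. It is not the plugin's Perl code, and the function name and sample data are hypothetical.

```python
from typing import Optional


def used_percent(used: Optional[float], total: Optional[float]) -> Optional[float]:
    """Return usage in percent, or None if the total is 0 or undefined."""
    if not total:  # covers both None (undef) and 0
        return None
    return 100.0 * (used or 0.0) / total


# Hypothetical instances: name -> (used, total)
instances = {"vol_a": (50, 200), "vol_b": (10, 0), "vol_c": (5, None)}

for name, (used, total) in instances.items():
    pct = used_percent(used, total)
    if pct is None:
        # Skip the instance with a message instead of dividing by zero
        print(f"{name}: skipped, total is 0 or undefined")
    else:
        print(f"{name}: {pct:.1f}% used")
```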
Removed
- check_netapp_process.pl is no longer supported and has been removed from the examples cfg due to security concerns.
6.0.0 Universal Getter and Check, Dependency Free Installation
Released 2020-10-06
New
- A universal binary collector (get_netapp) and check (check_netapp) choose the right binary or script to get data and check results. Motivation: Now that we support 7-Mode, Cluster-Mode ZAPI and Cluster-Mode RESTful filers with their respective APIs, finding the right getter and check had become really confusing. By providing a universal user interface with a single starting point for every case, we especially help new users set up the configuration.
- All Perl scripts and modules have been packed inside a ready-to-use docker image, making both installation and upgrades a snap.
- New PerfNic check to monitor for crc-errors and other counters of the physical network adapters.
- New switch --consider_latest_only for the SnapCenter-check.
Fixed
- The script prove_installation.pl returned a false positive result on some installations and has been removed.
- The check AutosizeMode returns no warnings anymore and can now check for the grow mode instead of the non-existent shrink mode.
- Shelf-environment getter stops on some hardware. (Message: Can’t call method “children_get” on an undefined value at ./get_netapp_shelfenv.pl). We provide a new switch --skip_empty_channel_lists as a workaround.
Known Issues
- Some getters still read the port from the authfile. Make sure to remove any port entry from the authfile so that the port can be chosen depending on the protocol (https with port 443 is the default, --no-ssl switches to http with port 80).
- The volume getter may not return the cluster-node vol0.
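The first known issue can be pictured as follows. In this hedged Python sketch, a port entry left in the authfile takes precedence over the protocol default. That precedence is an assumption drawn from the note above, and the function and parameter names are hypothetical.

```python
from typing import Optional


def effective_port(no_ssl: bool = False, authfile_port: Optional[int] = None) -> int:
    """Pick the port: an authfile entry wins, otherwise the protocol default."""
    if authfile_port is not None:
        # Assumption: a leftover authfile entry overrides the protocol default
        return authfile_port
    # https/443 is the default, --no-ssl switches to http/80
    return 80 if no_ssl else 443


print(effective_port())                                 # https default
print(effective_port(no_ssl=True))                      # http default
print(effective_port(no_ssl=True, authfile_port=443))   # stale entry wins
```

The last call shows why the entry should be removed: http would be attempted against port 443.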
5.3.0 New Autogrow-Check (EMS) and aggregate for RESTful API
Released 2020-03-10
New
- New check_netapp_ems (together with get_netapp_ems) makes it possible to detect an overly high rate of autogrow or other events. Both getter and check work only with DataONTAP 9.6 or later.
- get_netapp, the new RESTful getter, can now retrieve aggregate data too. This allows the Usage-check to be used with the aggregate object, and other aggregate-checks to work, based on NetApp's new RESTful API.
5.2.1 Fixes for Star-Setup and Missing rpmbuild/SPECS
Released 2020-03-02
This release contains an important fix for the cdot getter get_netapp_cm.pl if used in a star-setup and/or with FlexGroup Volumes.
Fixed
- New switch --skip_duplicates in get_netapp_cm.pl, which makes it possible to collect data in a star-setup with FlexGroup Volumes. See also this article in the blog for details.
- Although stated otherwise in the docs, the rpmbuild/SPECS file was missing. This file is available now.
5.2.0 Fixes for RESTful check_netapp and Regex-Matching in SnapCenter Check
Released 2020-02-14
This release contains a few fixes and improvements regarding the new RESTful collector get_netapp plus enhanced filters for check_netapp_snapcenter.
Fixed
- Add the missing transfiles for the RESTful collector get_netapp
- Improve the error message shown in case of missing capabilities. (The new API used by get_netapp requires extra capabilities configured in ONTAP. We already had an example-typescript in the manual - the improved error message should help to find it.)
Improved
- The resource-group- and policy-filters in check_netapp_snapcenter understand regex patterns as well as strings now. See the article on the blog for details.
5.1.0 New features and options and a new certificate-check
Released 2020-02-01
This release contains a few improvements which did not make it into the
5.0.0 release plus a new experimental check for server-certificates.
Improved
- The getter for the shelfbay-object has a new switch --ignore_empty_portlist.
- check_netapp_snapcenter has got a new option to send an alarm if specified jobs do not show up at all in the logs: --missing=WARNING|CRITICAL
New
- Certificate check: checks for expiring server-certificates
5.0.0 RPM-SPECS, new checks and a RESTful binary Collector
Released 2020-01-23
Improved
- New directory layout making it a bit easier to build packages (RPM).
A spec-file for rpmbuild is included as well. Please read the
documentation before upgrading!
- check_netapp_quotas: Lists the usernames of overused quotas. In addition, the new switch --list_users changes the view completely, so that soft- and hard-limits per user are printed in a human-readable manner.
- The rm_ack feature can now be switched off completely with --rm_ack=off. This goes a step further than --rm_ack=never. See our blog for more details.
New
- UnprotectedVolume searches for volumes not protected by SnapMirror. Sends an alarm if at least one unprotected volume is found. Using --include=<regex> and --exclude=<regex>, specific volumes and vservers, or groups of them, can be checked.
- VolumeAge warns about overly old or exceptionally young volumes.
- SnapCenter check: check_netapp_snapcenter alarms if errors occur during backups. This is the first check we deliver as a precompiled binary, which makes it fast and easy to install.
- EXPERIMENTAL: A completely new collector, get_netapp, is included to collect data from NetApp systems using their new RESTful API. This collector is also provided as a single, dependency-free precompiled binary. Give it a try if you have a filer with DataONTAP 9.6 or higher at hand. (But consult the docs beforehand!)
Changed
- The quite old check_netapp_spare has been removed due to security issues (the required ZAPI needs write access). This check can easily be replaced with the more up-to-date DiskCount check.
- The universal check_netapp_anycli has been deprecated and will be removed with the next version due to security considerations.
- The collector get_netapp_cm.pl (as well as any new collectors like get_netapp) requires a license file in place. Please consult the chapter License File in the documentation.
Released 2019-06-19
New
- check_netapp_time: Checks for a valid ntp-server configuration on the filer and alarms if the drift between the filer's system time and the monitoring server gets too high.
- The Usage check gets a new parameter --aggr_over=<percentage> to alarm only if aggregates are both overcommitted and beginning to fill up.
- Overall checks like Usage have a new parameter --top=<n>. Setting --top to a number reduces the number of instances (e.g. volumes) shown in the output to that number. This has been requested to shorten the sometimes very long lists of volumes, disks and the like.
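To illustrate what --top=<n> does to the output, here is a small Python sketch that keeps only the n highest-ranked instances. Ranking by the checked value, highest first, is an assumption, since the text does not state the ordering. The function name and sample data are hypothetical.

```python
from typing import Dict, List, Tuple


def top_instances(usage_by_name: Dict[str, float], n: int) -> List[Tuple[str, float]]:
    """Return the n instances with the highest value, highest first."""
    ranked = sorted(usage_by_name.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


# Hypothetical usage percentages per volume
usage = {"vol_a": 10.0, "vol_b": 90.0, "vol_c": 50.0}
for name, pct in top_instances(usage, 2):
    print(f"{name}: {pct:.0f}%")
```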
Improved
- PerfDisk got an additional performance object disk_constituent. Using disk_constituent (instead of disk) avoids getting Multiple_Values returned instead of valid instance names. If this new object is used for the check, the getter's object must be changed as well!
4.2.0 New checks and a flexible way to remove Service Acknowledgements
Released 2019-03-11
New
- AutosizeMode checks if the auto-size mode of all volumes matches a given preset (grow, grow_shrink).
- Head has a new object version. This both shows and monitors the DataONTAP version of the nodes. In addition, it prints the box model on the first line so that it can be quickly searched in the GUI.
- SisStatus flags volumes whose compression and deduplication are not enabled.
- Service Acknowledgements can be removed by a custom script using the new parameter --rm_ack_handler=<handler-script>.
- check_netapp_unused_lun searches and flags LUNs which are mapped but whose initiators are not logged in.
- DiskPathQuality checks disk path qualities, reports i/o-error percentages and raises a CRITICAL error whenever an error percentage is above zero.
Improved
- Completely rewritten example configurations. The etc/*.cfg files are taken as they are from a running monitoring-system in our lab. So typos or outdated left-overs are almost impossible from now on.
4.1.0 Remove --node from check_netapp_cluster, fix and improved documentation
Released 2018-12-06
Fixed
- The storefiles written by the getters from the 4.0.x releases were
several times bigger than the ones from older versions. We have
fixed that now.
- The input value of --object is validated in check_netapp_snapshots.
- VolumeAutosize failed on MetroClusters because of undefined values. These volumes are skipped now.
Changed
- Remove --node from check_netapp_cluster: The check_netapp_cluster is divided into check_netapp7_cluster for legacy 7-Mode systems, and the cdot-logic for checking the cf-status has been incorporated into check_netapp_takeover.
- Checks using system-cli have got improved documentation.
4.0.2 Fixes
Released 2018-11-07
Fixed
- Updated the cpanfile and the chapter on Required Perl Modules in the
installation document.
- The parameter --max_records is evaluated (this speeds up getters which pull a large number of instances).
4.0.1 Refurbishment of the whole suite
Released 2018-10-23
Fixed
- The getter's --stm switch has been re-enabled (in 4.0.0 it was silently ignored, which made UsageTrend fail).
4.0.0 Refurbishment of the whole suite
Released 2018-10-15
As the 3.x versions of check_netapp_pro are now over 3 years old,
it is time for some changes in both the libraries and the user
interface.
So we take this major release as an opportunity to remove deprecated
arguments and introduce new ones. This will require some changes in
the configuration (something we avoid whenever possible in non-major
releases)!
Fixed
- ShelfEnvironment --what=voltage failed (exit 255) in case of a PSU error. Failed PSUs are reported as CRITICAL now.
Changed
- Completely rewrote the cluster-mode getter to make it more efficient and easier to maintain. Several objects now have their own dedicated getter (disk, snapshot, ...).
- Introduce the --support switch as a replacement for --mammamia. (This applies to all getters with the exception of 7m and perfdata.)
- Removed the --vserver|--node|-s parameter from most checks. These checks now have the node- or vserver-name prefixed to every instance-name. So instead of the old --node=my-node one can now filter with --include=^my-node. This also obsoletes the --svm_in_name switch we had implemented in various checks. This switch has also been removed, since that behavior is the default now.
- The call-files are grouped into subfolders per host now. This change will not be visible on sites which start with an empty storedir. (Background: This change reduces the number of files per directory to avoid negative implications on file-system performance.) Sites which want to keep their existing (long-term) stores must run call2dir.pl once.
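The move from --node=my-node to --include=^my-node relies on the node- or vserver-name being the prefix of every instance-name. The Python sketch below shows how such a regex filter selects instances. The ":" separator in the sample names and the function name are assumptions made for illustration only.

```python
import re
from typing import Iterable, List, Optional


def filter_instances(names: Iterable[str], include: Optional[str] = None,
                     exclude: Optional[str] = None) -> List[str]:
    """Keep names matching `include` (if given) and not matching `exclude`."""
    inc = re.compile(include) if include else None
    exc = re.compile(exclude) if exclude else None
    selected = []
    for name in names:
        if inc and not inc.search(name):
            continue
        if exc and exc.search(name):
            continue
        selected.append(name)
    return selected


# Hypothetical prefixed instance names
names = ["my-node:vol0", "my-node:vol_data", "other-node:vol0"]
print(filter_instances(names, include="^my-node"))
```

Note that a pattern such as ^my-node would also match a node called my-node2, so including the separator in the pattern may be worthwhile once its real form is known.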
Minor Changes
- The default of Raidstatus --ok_state also includes mixed_raid_type, hybrid.
New and improved
- New PerfQtree which can monitor some ops-counters on the q-tree level.
- Enhanced check_netapp_takeover: New option --ic_check checks the interconnect-links and reports how many of them are up or down.
- ShelfEnvironment can have thresholds for the temperature-object in addition to the sensor-state check.
- Snapshots-check: Add additional filter --check_only=with(out)_lun
3.10.3 Fix several direct-check for 7m backwards-compatibility
Released 2018-04-23
Fixed
The following checks are backwards compatible with 7m again:
- check_netapp_cluster
- check_netapp_health
- check_netapp_license
- check_netapp_scrub
- check_netapp_spare
Plus improved the help-text in many other checks.
3.10.2 Important Bugfix for ServiceProcessor, Update-Mode for Getters, Grafana-Compatibility
Released 2018-03-06
New and improved
- Remove --svm_in_name switch.
- Getters are getting smarter with the new update mode (e.g. --update=3min)
- New --perf_format=grafana: This allows perf-data also for status-checks, which is important to visualize them in Grafana.
- New check_netapp_unused_lun: Checks for LUNs which are online but do not have an initiator connected.
- PerfSysNode has got a new parameter: --math=average|total.
- UsageTrend: Additional parameter --reduced_history=OK|WARNING|CRITICAL|UNKNOWN allows specifying at which level a reduced-history condition is communicated. Defaults to WARNING, which is the behavior we had hardcoded so far.
- New check_netapp_process: Checks for runaway processes on a filer.
Fixed
3.10.1 Improvements f. UsageTrend, ASUP-check & LunAlignment, DataONTAP 9.3 compatible
Released 2018-02-13
New and improved
- The checks and getters are compatible with the upcoming DataONTAP 9.3
- Check LunAlignment has a new switch: --show_misaligned_luns.
- Check UsageTrend supports two new filter and display switches already known from other checks: --check_only=without_lun|with_lun|... and --svm_in_name.
- The ASUP-log-check will now consider retransmitted messages and has got a --lookbehind filter to ignore messages which are too old.
- UsageTrend handles missing data in a store more flexibly (tries to calculate with reduced look-behind).
- DiskPaths2 is a planned replacement for the deprecated DiskPaths check. DiskPaths2 has a different UI and is in an early alpha-state. Do not rely blindly on the config-sets delivered with this check!
Fixed
- Check Job correctly shows the number of queued jobs in its perf-data. Also enhanced the --help and examples.
3.10.0 PerfSysNode, PerfAggregate, LunAlignment, ASUP-Check
Released 2017-12-07
New and improved
- New check PerfSysNode (amends PerfSys for clusters with DataONTAP 8.3+)
- New check LunAlignment
- New check PerfAggregate
- New check check_netapp_asup.pl monitors the autosupport-log for failed transmissions.
- Much better documentation for the growth metric of OvercommitAggr
- Add option --check_only=with_lun|without_lun to SnapshotLessVolume
- Enhance UsageTrend: --what=inodes interpolates the usage of inodes and warns if the number of available inodes will be reached within a given time.
- Enhance check_netapp_takeover: checks the metro-cluster configuration
- Snapshots --older_than and --younger_than also accept seconds, minutes and hours (plus days and weeks as in previous versions).
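The inode trend described for UsageTrend --what=inodes can be pictured as projecting current growth forward to the point where no inodes are left. The Python sketch below uses a plain two-point linear extrapolation. This is an illustrative assumption and may differ from the method the check actually uses. All names are hypothetical.

```python
from typing import Optional


def seconds_until_full(t0: float, used0: float, t1: float, used1: float,
                       total: float) -> Optional[float]:
    """Linearly extrapolate two samples; None if usage is not growing."""
    if t1 <= t0:
        raise ValueError("samples must be in chronological order")
    rate = (used1 - used0) / (t1 - t0)  # inodes per second
    if rate <= 0:
        return None  # flat or shrinking usage never reaches the limit
    return (total - used1) / rate


# Two hypothetical samples taken 100 seconds apart
remaining = seconds_until_full(t0=0, used0=100, t1=100, used1=200, total=1000)
warn_within = 3600  # hypothetical threshold: warn if full within one hour
if remaining is not None and remaining < warn_within:
    print(f"WARNING: inodes exhausted in about {remaining:.0f} s")
```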
Fixed
- Better error handling if --storedir points to a non-writable directory
- Sis: Handling of volumes where an error occurs while fetching the status values
- Head: Fixed several more and less cosmetic issues when one node is down
- Disk: Fixed --what=non-zeroed-spare
- UsageTrend: Fixed handling of offline or otherwise not available volumes.
3.9.2 ServiceProcessor Check
Released 2017-09-13
New and improved
Fixed
- Usage: --check_only and --svm_in_name are compatible now
- check_netapp7_vfiler handles --check_network even if vfnet is empty
- check_netapp_quotas: Option --path filters by qtree-path now
- Distribution: takeover- and quota-check are now distributed as part of the ADVANCED Bundle
- Documentation: missing api aggr-scrub-list-info in list_apis.pl --mode=7m
3.9.1 Fixes + new Takeover and Job-Check
Released 2017-08-30
Fixed
- Improved the error message in case of too little data in perf-stores.
- Fixed the error-message in case a filer does not return a counter-value.
- Removed the example from Snapshots which showed how to search with Snapshots for volumes without snapshots, as this did not work as expected. Using Snapshots to check for snapshot-less volumes will result in false negatives! Existing customers should check their configuration and replace these service-checks with the new SnapshotLessVolume check.
- Fixed a bug which stopped UsageTrend from working if some instances had been created within the look-behind period
New and improved
- New: check_netapp_takeover - checks if the take-over facility is enabled and takeovers are possible
- New check: SnapshotLessVolume - searches for volumes without a snapshot.
- New check: Job - checks for failed jobs.
- New switch --svm_in_name for the Usage-check prefixes the volume's name with the vserver. This avoids confusing duplicates in the output.
Released 2017-07-11
Important
Due to a change in the store-format existing store-files must be
converted (or deleted) before upgrading! Please read the installation
instructions.
Fixed
- --mammamia for 7m-getter did not write into storedir
- provide --check_only=without_lun_7m for 7m-filers
- check_netapp_anycli: Document switches --like_result and --unlike_result
- get_netapp_shelfenvironment: Do not stop if one node is down
- check_netapp7_vfiler.pl: Handle undefined vfiler-status
- SnapMirrorState: Avoid UNKNOWN caused by no longer existing
snap-mirrors
- SnapMirrorState/Metric: Excluding all relations and setting
--no_instances=OK is ok now
New and improved
- New check: LunSize
- Faster, leaner and more reliable store-format (existing store-files must be converted before upgrading! Please read the installation instructions.)
- New switch: --ignore_case (ignore case for --exclude and --include patterns)
3.8.1 Quota check and workarounds
Released 2017-05-04
New and improved
- New check: check_netapp_quotas.pl - monitors quotas on a cdot filer.
- StorageUtilization output units can be controlled by --factor=ki|Mi|Gi|...
- DiskPaths: Bugfix and additional switch --port_pattern_ok=AAAA | BBBB | ...
- ShelfEnvironment: False positives regarding the shelf-status solved with switch --workaround_for_279931
- ShelfEnvironment: False positive for voltage-sensors in DataONTAP 9.1P1 (solved with temporary switch --DataONTAP_91P1)
3.8.0 FCPAdapter-Check, Usage --check_only=<range>, PerfVolume --math=average|total...
Released 2017-03-24
New and improved
- New check: FCPAdapter checks fcp-adapters status (online, ...)
- Usage-check: --check_only=500GiB..1TiB (ranges to specify different thresholds depending on the volume's total size)
- New switch in PerfVolume: --math=average|total
- New metric in Sis: changelog-used-percent
- New switch --ignore=^temp__ in perfdata-getter - this is the default!
- New --node|--vserver switch for the VolumeAutoSize check
- New read_ops and write_ops counters for the PerfDisk check.
- New switch --exclude=<counter> in perfdata-getter. Required to exclude the newly added _ops-counters on older filers.
- Specific cdot-getter for the ShelfEnvironment check; this new getter does not need the --node parameter any more.
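The size range in --check_only=500GiB..1TiB can be read as two binary-prefixed sizes joined by "..". A hedged Python sketch of parsing and applying such a range follows. The accepted unit spellings and the inclusive bounds are assumptions, and the function names are hypothetical.

```python
import re
from typing import Tuple

# Binary prefixes; the set of units the check accepts is an assumption
UNITS = {"KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}


def parse_size(text: str) -> int:
    """Convert a string like '500GiB' into bytes."""
    match = re.fullmatch(r"(\d+)(KiB|MiB|GiB|TiB)", text)
    if match is None:
        raise ValueError(f"unsupported size: {text!r}")
    return int(match.group(1)) * UNITS[match.group(2)]


def parse_range(text: str) -> Tuple[int, int]:
    """Split 'LOW..HIGH' and convert both ends into bytes."""
    low, high = text.split("..")
    return parse_size(low), parse_size(high)


def in_range(total_bytes: int, size_range: str) -> bool:
    """True if a volume's total size falls into the range (bounds inclusive)."""
    low, high = parse_range(size_range)
    return low <= total_bytes <= high


print(in_range(600 * 1024**3, "500GiB..1TiB"))
```

With a helper like this, one service definition per size class can carry its own thresholds.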
Fixed
- ClusterPeerHealth: Returns "UNKNOWN - No matching instances found" instead of "no data" if no peer has been found
3.7.1 USABILITY IMPROVEMENTS and FIXES
Released 2016-12-16
New and improved
Fixed
- Snapshots: --show_problems works now as documented
- Sis: handles sis-disabled volumes (skips them if --check_disabled is not set)
- PerfDisk: tolerance was missing
3.7.0 MULTI-STORE, MULTI-TENANCY CHECKS, SIS CHECK
Released 2016-11-08
New and improved
- All getters can write to more than one storedir. This makes it possible to check from more than one peer/poller. ATTENTION 1: You need either to start with an empty storedir or to migrate existing stores. (See also section Upgrading to Version 3.6.x and 3.7.x in the docs.) ATTENTION 2: Storedirs do not get created automatically any more. Add --autocreate_storedir to the getters to get back to the old behavior.
- Direct checks with overall-logic (e.g. check_netapp7_fcpstats,
check_netapp_scrub) can write their instance_results into more
than one storedir.
- New Multi-Tenancy Checks: ReportSpace, ReportIOPS
- New Check Sis - check for dedup-problems (stale-fingerprints,
run-time)
- New check check_netapp_scrub to alarm if the last scrub's timestamp is over a certain age.
- Disk check has thresholds now - get a WARNING for 1 failed disk and a CRITICAL alarm for 2 or more.
- --max_records can be configured to reduce the time to collect data (volume, vol_snapshot, disk) from cdot-filers
- Snapshots can check the size of single snapshots now (--single).
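The Disk check thresholds mentioned above (WARNING for 1 failed disk, CRITICAL for 2 or more) map a count onto a monitoring state. A minimal Python sketch with hypothetical names follows. The default values mirror the example in the text and are not confirmed as the check's built-in defaults.

```python
def disk_state(failed_disks: int, warning: int = 1, critical: int = 2) -> str:
    """Map the number of failed disks onto a Nagios-style state."""
    if failed_disks >= critical:
        return "CRITICAL"
    if failed_disks >= warning:
        return "WARNING"
    return "OK"


for count in (0, 1, 2, 5):
    print(count, disk_state(count))
```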
Released 2016-09-03
New and improved
- A new format and logic for some store-files dramatically reduces the
memory-footprint of checks using performance- and history-stores.
- New check_netapp_license.pl
- New switch --vserver for LunState
- New switch --prefix for PerfLif
- New switch --uninitialized_relation_is_ok for SnapMirrorMetrics (Useful if you set up a far-away vserver peer and activated protection with snapmirror - synchronization can take a long time then.)
- New switch --exclude_dependency=vclone for the Snapshot check to filter on application-dependency
- Hint in output if instances have been excluded
- Snapshots-check: --node|--vserver is optional now
Fixed
- ShelfEnvironment fixed to check 7m MetroCluster
3.5.2 REACTIVATE max_length_multi_line
Released 2016-07-06
New and improved
- Length of output can be limited by setting max_length_multi_line in check_netapp_pro.pl
3.5.1 --average IMPLEMENTED into PERFDISK
Released 2016-06-12
New and improved
- --average implemented into PerfDisk
3.5.0 CLUSTER-PEER-HEALTH, NO INSTANCES CAN BE OK NOW!
Released 2016-06-08
New and improved
- New check ClusterPeerHealth
- New check MetroClusterVserver
- New option for the collector-checks: --no_store=OK|... (most useful together with the collectors option --no_instances=OK|...)
- DiskPaths: unassigned disks are skipped
- VolumeState: Node-filter --exclude_nodes/--include_nodes introduced
3.4.2 FIXES USAGE/SHELF-ENVIRONMENT, DISKCOUNT, WORKLOAD, LUNSIZE
Released 2016-04-21
Fixed
- Usage-Check: Handling of volumes with no state-attribute
- shelf-environment-object in cm-getter needs a node-name
(sometimes)
- snap-mirror getter for 7-mode: Handle empty config (no snap-mirrors)
- Respect the options --timeout and --hide_uuid
New and improved
- new DiskCount check (counts spare-disks per type or
storage-pool)
- new workload-object in the perf-getter (in preparation for a cdot
total-ops check)
- extended lun-object in the 7m/cm-getters (in preparation for a
LunSize check)
3.4.1 SWITCH --VSERVER|--NODE OPTIONAL FOR USAGE-CHECK, FIXES, DOCUMENTATION
Released 2016-03-07
Fixed
- Usage check: unknown volumes are skipped now
- Snapshots: target type not properly shown in case of singular
New and improved
- Switch --vserver|--node is now optional for the Usage check
- Documentation and examples
3.4.0 SHELF-CHECKS, UNZEROED-SPAREDISKS, UNBALANCED DISKS, METROCLUSTER-CHECKS
Released 2016-02-25
Fixed
- Vserver checks operational-state (instead of admin-state) if
DataONTAP >= 8.3
New and improved
- New: ShelfEnvironment checks a shelf's temperature, power-supplies and sensors
- New: Check for non-zeroed spare disks with the Disk check
- New check BadlyPerformingDisks for detecting unbalanced disk usage.
- New check_netapp_mc_config.pl for checking a MetroCluster's mode and configuration.
- New check_netapp_anycli.pl for building checks with simple CLI-commands.
- New option --alarm_limit (signals critical conditions as warning)
- check_netapp7_snapvault.pl can check the lag-time
- New option --display_factor for check_netapp7_snapvault.pl
- New option --skip_reason in Vserver check
- New option --avg in PerfCpu (for getting the averaged usage)
3.3.1 FABRIC METRO CLUSTER SUPPORTED, USER-FRIENDLY TIMES, FIXES
Released 2015-12-01
New and improved
- DiskPaths can check Fabric Metro Clusters with 8 disk-paths
- User-friendly time-values in addition to seconds: h, min, d(ay), (w)eek. E.g. -w 3 -c 72 --factor=h instead of -w 10800 -c 259200
- New Option: --perfdata_uom_string to replace correct but not accepted units like '/s' with whatever you want. E.g. --perfdata_uom_string=s|persec|empty|°C|µs|us|...
- IfGrp lists ports which are down
- NetPort/NetPort7m: new switch --ignore_disabled
- Getter for shelf-bay objects can retrieve data from a --single_system
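The --factor example above is plain unit conversion: 3 hours are 10800 seconds and 72 hours are 259200 seconds. A small Python sketch of that arithmetic follows. The table keys are illustrative and the exact spellings accepted by --factor are an assumption, so consult --help for the real ones.

```python
# Seconds per unit; key spellings are assumptions for illustration
FACTORS = {"s": 1, "min": 60, "h": 3600, "d": 86400, "w": 604800}


def to_seconds(value: float, factor: str = "s") -> float:
    """Scale a threshold given in `factor` units to seconds."""
    return value * FACTORS[factor]


# -w 3 -c 72 --factor=h corresponds to -w 10800 -c 259200
print(to_seconds(3, "h"), to_seconds(72, "h"))
```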
Documentation
- Handling of Error "Magic number checking on storable file failed"
Fixes
- NetPort handles undefined link-status (sends an alarm)
- Usage of check_netapp7_vfiler.pl mentioned wrong switch
3.3.0 MORE LEGACY CHECKS, DISK INVENTORY INFO, CHECK for FCP CRC-ERRORS
Released 2015-11-04
New and improved
- SnapMirror has a getter for 7-mode
- New Check for 7-mode: check_netapp7_snapvault
- New Check for 7-mode: check_netapp7_vfiler
- New Check: check_netapp7_fcpstats: Checks for fcp crc-errors
(7mode only)
- Disk shows inventory info (so actions can be taken immediately)
- New Switch to hide the lengthy UUID in output
- New: Snapshots can now filter on the snap-mirror-label
- Additional counters (evicts, disk_reads_replaced) in the
FlashCash-Check
- Preparation for DiskPath: path_use_state gets collected, required
for checking Fabric MetroClusters (FMCs)
Fixes
- interface-getter/IfGrp check: handle interfaces with undefined
mediatype or empty link-states
- IfGrp check can handle vifs-in-vifs recursively
- The labels in the perf-data are always unique now
Documentation
- New chapter: How to configure a cluster-mode read-only user for
monitoring
- Several improvements, mainly in the Trouble-Shooting section.
3.2.0 NEW CHECKS and SWITCHES, FIX f. DUPLICATE INSTANCES BUG, MICRODELTAS
Released 2015-08-27
New and improved
- new check: PerfTcpIp
- new check: NetInterface
- new check: PerfHostadapter
- new check: UsageTrend
- new check: SyncMirror for cdot Metro Cluster
- new switch: --no_instances=OK...
- new switch for volume Usage: --check_only=with_lun|without_lun
- alarm (UNKNOWN) in case of counter overflows
- microdeltas (experimental feature)
- check_netapp_spare.pl: Better (more human understandable) help and
output
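The "alarm (UNKNOWN) in case of counter overflows" item can be illustrated with monotonically increasing performance counters: if the current sample is smaller than the previous one, the counter has wrapped or been reset and no meaningful delta exists. Detecting an overflow this way is an assumption, and the Python sketch below uses hypothetical names.

```python
from typing import Optional, Tuple


def counter_delta(previous: int, current: int) -> Tuple[str, Optional[int]]:
    """Return (state, delta); state is UNKNOWN when the counter went backwards."""
    if current < previous:
        # A shrinking monotonic counter indicates an overflow or reset
        return "UNKNOWN", None
    return "OK", current - previous


print(counter_delta(40, 100))
print(counter_delta(100, 40))
```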
Fixes
- duplicate perf-instances in cmode if the number of instances was >
150
3.1.1 LEGACY CHECKS, CHECK FOR BROKEN AND- UNASSIGNED DISKS
Released 2015-03-25
Fixes
- IMPORTANT: broken disks are no longer skipped in cluster mode
- disk getter handles unassigned disks
New and improved
- Legacy Checks ported from 2.x to 3.x: check_netapp7_head.pl, check_netapp_health.pl
- Filer can be checked for broken or unassigned disks
3.1.0 STABLE RELEASE
Released 2015-03-13
Fixes
- some small fixes in the cfg-files
- removed confusing default-thresholds in PerfVolume
New and improved
- New switch --show_problems appends all non-ok instances to the output's first line. Ideal for sending more comprehensive alarm messages to mobile phones. (More details are in the blog)
3.0.18 RELEASE CANDIDATE II
Released 2015-02-25
Fixes
- StorageUtilization returns 25 out of bounds
- snapshot-getter can handle offline/restricted/busy volumes
New and improved
- --no_relations_is_ok switch for SnapMirror checks
- DiskPaths can check up to 4 paths
- Documentation improvements
3.0.17 RELEASE CANDIDATE I
Released 2015-01-26
New and improved
- VolumeAutosize for cm and DataONTAP 8.2.1
- PerfLif for DataONTAP 8.2.1 or higher
- New check Disk (replaces the deprecated DiskFailed)
- New option-argument to switch off performance-data: --perf_data=suppress
Fixes
- Lots of fixes and enhancements based on feedback from internal
UA-tests
- Lots of fixes and enhancements based on customer-feedback (thanks to
all of you!)
- Reduced ONTAPI requirement for all 7m checks (1.11 instead of 1.15)
Documentation
- Much more complete configuration files (Nagios cfg) for 7m and cm
- Getter tells the user the correct getter-objects if an unknown object is detected
- Getter: Pseudo-object 'explore' lists all available objects (-o explore).
3.0.16 NETPORT7M, FIX in SHELFBAY, ENHANCED CLUSTER-HA-CHECK
Released 2014-10-15
- Fixed: ShelfBay getter died with non-installed disks
- New: NetPort7m checks interface-status on 7-mode filers
- Enhanced: check_netapp_cluster checks 'is-enabled' and
'is-interconnect-up'
- Updated: Installation Documentation
Released 2014-10-06
- Improved: Reduced amount of data retrieved from filer (volume,
vol_snapshot in cluster-mode)
- Debug-output of perf-getter extended to show timestamps for both
instance-name-and counter-value retrieval period.
3.0.14 LUNSTATE, STORAGEUTILIZATION
Released 2014-10-02
- New check: LunState check for offline-LUNs
- Now available in pro: StorageUtilization
- Fix for problem on systems with a patched Carp-module causing the getter to stop with "Do you need to predeclare croak?"
3.0.13 VSERVER, DiskFailed, SnapVault
Released 2014-09-16
- Renamed: Vfiler has been renamed to Vserver.
- Enhanced: DiskFailed can now check for offline-disk on 7m filers
- SnapVault included into SnapMirror, relationship-type visible in --explore
3.0.12 VFILER, WAFL
Released 2014-08-21
- New check: Vfiler monitors the state of data-vfilers.
- New check: Wafl monitors WAFL-counters like consistency points
per second.
3.0.11 SEVERAL FIXES AND DOCUMENTATION IMPROVEMENTS
Released 2014-08-14
- Fixed NVRAM
- Relaxed ONTAPI Requirement for 7-Mode
- utf8 conversion of all libraries
3.0.10 OVERCOMMIT-AGGR, VOLUME-AUTOSIZE
Released 2014-07-14
- New check: VolumeAutosize alarms if autosized volumes grow too
fast
- Improvement: --explore/ --discover
- Fix: Removed dependency on antlers
- Feature: OvercommitAggr provides additional growth metric
3.0.9 SNAPMIRROR
Released 2014-06-24
New Checks in this version
- SnapMirrorMetrics - checks SnapMirrors lag-time, transfer-size,
...
- SnapMirrorState - checks SnapMirrors health and mirror-state.
New Feature
- SnapShots w/ new switch
--comparison=lt
find volumes with no
snapshots
3.0.8 MORE PERF-CHECKS
Released 2014-06-03
New Checks in this version
- BufferCache - checks several metrics of the system buffer cache
(=system memory)
- FlashCash - checks the PAM2-cards performance
- LunLatency - checks a luns latency or operations/second
3.0.7 REDUCED DEPENDENCIES, PERFSYS, MAMMAMIA
Released 2014-06-02
Fixed
- IfGrp was broken for cluster-mode. Added --node switch to allow selective checking of a specific node.
New Checks
- PerfSys - checks system-wide perf-counters (transfer-rates, ops,
...)
Updated
- Removed dependency on non-core module Const::Fast antlers (installation should be possible without CPAN now)
- Performance improved by using faster accessors
- PerfIf and PerfCpu got a --node switch to selectively check a specific node.
New Feature
- --mammamia - setting one single switch writes several
debug-infos into one, easy to compress and transfer directory. Now
supported for *all* getter-scripts!
3.0.6 PERFVOLUME's --vserver, --mammamia
Released 2014-05-19
Fixed
- PerfVolume -H <myfiler> --vserver=vserv01 checks only volumes of vserv01.
New Feature
- --mammamia - setting one single switch writes several debug-infos into one easy-to-transfer directory. Only supported for check_netapp_perfdata.pl so far. (Had erroneously been announced for 3.0.5 already)
3.0.5 VOLUME-PERFGETTER USES VSERVER NOT NODES
Released 2014-05-19
Fixed
- check_netapp_perfdata.pl --volume uses vserver not node as container for the volumes.
3.0.4 SIMPLIFICATIONS AND NEW PERFCHECKS, DISKFAIL-CHECK, FIXES
Released 2014-05-15
Simplified
- Performance-getter and performance-checks always have a 1:1 relation
now. (E.g. PerfVolume requires just one getter and not two.)
- The perfgetter's --mode switch can be written as an attribute into netapp_credentials (auth-file). See the example in etc.
Fixed
- Usage for volumes, VolumeState handle down-nodes
- ShelfBay --state is mandatory now
- ShelfBay --state=disk false-negatives eliminated (may be too strict now - please report)
New Checks
- PerfCpu - checks net-ports link-status on cm-filers. Replaces
IfConfig.
- NVRAM - checks NVRAMs performance.
- PerfIf - checks network-interfaces performance.
- DiskFailed - checks for failed disk.
3.0.3 SHELF-BAY/HEAD/UPTIME FIXED, NEW NET-PORT CHECK
Released 2014-05-07
Fixed
- ShelfBay check found duplicate bays on FAS 2240
- Head and Uptime handle rebooting nodes better, additional switch --vserver
New Checks
- NetPort - checks net-ports link-status on cm-filers. Replaces
IfConfig.
3.0.2_02 IFGRP for CLUSTER MODE (EXPERIMENTAL)
Released 2014-04-30
New Checks
- IfGrp - supports cluster-mode. Use -o net-port for the getter. (experimental - please report!)
3.0.2_01 EXPERIMENTAL BUGFIX + NEW PLUGINS
Released 2014-04-29
Fixed
- Getter for shelf-bay fixed (experimental - please report!)
New Checks
- VolumeState - checks for non-online volumes (configurable)
- AggregateState - checks for non-online aggregates (configurable)
- Raidstatus - checks the raid-status of aggregates
3.0.2 FIRST BETA - BUGFIXES
Released 2014-04-23
Fixed
- debugging printed w/o verbose
- usage in Snapshots
- message in Heads
3.0.1_14 FOURTEENTH ALPHA - Heads
Released 2014-04-11
Implemented
- Uptime - reappeared for cm. Checks the time since last reboot.
- check_netapp_cluster - direct check (w/o collector) - checks
the HA-status
3.0.1_13 THIRTEENTH ALPHA - Heads
Released 2014-04-07
Implemented
- Heads - reappeared for cm. Checks hardware and global
health-state.
Fixes
Several bugs and UI-inconsistencies repaired.
3.0.1_12 TWELFTH ALPHA - PerfDisk
Released 2014-04-02
Implemented
- NEW: PerfDisk - checks disk_busy counter on 7m and cm.
3.0.1_10 TENTH ALPHA - several additional checks
Released 2014-03-24
Implemented
- NEW: SyncMirror - checks the mirror-status on Metro Cluster
aggregates
- NEW: IfGrp - checks if an interface-group has enough active
links (replaces IfConfig)
- NEW: DiskPaths - checks if each disk has two paths (A/B, B/A)
3.0.1_06 SIXTH ALPHA - ShelfBay (Shelf- and Disk Port Status Monitoring)
Released 2014-03-11
Implemented
- NEW: ShelfBay - checks the status of shelves and the disk-port
status (e.g. BYP, disk bypass).
- Enhanced help (--help) for all checks (examples).
3.0.1_05 FIFTH ALPHA - PerfVolume (Per Volume Latency) gets PRO
Released 2014-03-07
Implemented
- NEW: PerfVolume - checks as many volumes as you want - no more
'Not enough memory to get instances'.
- NEW: PerfVolume has additional counters for CIFS and FCP and
checks both 7-mode and cluster-mode filers.
- NEW: get_netapp_perfdata.pl lists all available counters from your system (try it out: get_netapp_perfdata.pl -H filer -o volume ... --explain=counters).
3.0.1_03 + 3.0.1_04 THIRD and FOURTH ALPHA of combined PRO-Release (7M/CM)
Released 2014-02-28
Implemented
- FIX: Authentication via credentials file fixed
- NEW: check_netapp_spare - checks for spare-low-conditions (not
enough spare-disks) on nodes and clusters (both 7m and cm)
3.0.1_02 SECOND ALPHA of combined PRO-Release (7M/CM)
Released 2014-02-02
Implemented
- New architecture which supports both DataONTAP versions (7m,
cluster-mode).
- Aggregate snapshots in 7-mode
The following checks should work for both 7m and cm:
- Usage (volumes, aggregates)
- Snapshots (volume- and aggregate-snapshots for 7m, volume-snapshots
for cm)
Missing
- A check for shelves which, together with a Head-check, will replace the old Hardware-check.
- Head, Performance Checks
3.0.1_01 FIRST ALPHA of combined PRO-Release (7M/CM)
Used only internally.