check_netapp_ifgrp (Interface Groups)

Checks the health of NetApp interface groups (ifgrps) by comparing active ports against member ports.

Usage

$ check_netapp_ifgrp -H <host> [--partial=<status>] [--ignore-disabled] [...] [--help]

Description

This plugin monitors NetApp interface groups (ifgrps / LAGs). For each ifgrp it compares the number of active ports against the number of member ports and evaluates the result:

Condition                     Status
All member ports active       OK
Some member ports active      WARNING (configurable via --partial)
No active ports (all down)    CRITICAL (always, regardless of --partial)
No member ports defined       UNKNOWN
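
The evaluation rules in the table can be sketched as a small function. This is an illustration only, not the plugin's actual code: the name evaluate_ifgrp and its parameters are hypothetical, and checking "no member ports" before "no active ports" is an assumption, since the table does not say which rule wins when both apply.

```python
VALID_STATES = {"OK", "WARNING", "CRITICAL", "UNKNOWN"}


def evaluate_ifgrp(active: int, members: int, partial: str = "WARNING") -> str:
    """Return the status of one ifgrp from its active/member port counts.

    Hypothetical helper; `partial` mirrors the --partial option.
    """
    if partial not in VALID_STATES:
        raise ValueError(f"unsupported --partial value: {partial}")
    if members == 0:
        return "UNKNOWN"   # no member ports defined (assumed to be checked first)
    if active == 0:
        return "CRITICAL"  # all ports down: always CRITICAL, --partial is ignored
    if active < members:
        return partial     # some but not all member ports active
    return "OK"            # all member ports active


print(evaluate_ifgrp(2, 2))                # OK
print(evaluate_ifgrp(1, 2))                # WARNING
print(evaluate_ifgrp(1, 2, partial="OK"))  # OK
print(evaluate_ifgrp(0, 2, partial="OK"))  # CRITICAL
```

The last call shows why --partial=OK cannot hide a complete outage: the zero-active rule is evaluated before the partial rule.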

The check uses the same REST API endpoint as check_netapp_netport (api/network/ethernet/ports) but filters for ports of type lag only.

Instance Name Format

Each ifgrp is identified by {node}.{name}, for example cluster-01.a0a.
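
To make the endpoint filtering and the instance name format concrete, here is a minimal sketch that works on hand-written sample records rather than a live API call. The record layout (type, node.name, lag.member_ports, lag.active_ports) is an assumption about the shape of the api/network/ethernet/ports response, and ifgrp_port_counts is a hypothetical helper.

```python
# Hypothetical, abbreviated port records; the field names are assumptions
# about the REST response, not taken from the plugin source.
SAMPLE_PORTS = [
    {"name": "e0a", "type": "physical", "node": {"name": "cluster-01"}},
    {"name": "a0a", "type": "lag", "node": {"name": "cluster-01"},
     "lag": {"member_ports": [{"name": "e0c"}, {"name": "e0d"}],
             "active_ports": [{"name": "e0c"}]}},
    {"name": "a0a", "type": "lag", "node": {"name": "cluster-02"},
     "lag": {"member_ports": [{"name": "e0c"}, {"name": "e0d"}],
             "active_ports": [{"name": "e0c"}, {"name": "e0d"}]}},
]


def ifgrp_port_counts(records):
    """Map '{node}.{name}' to (active, members) for ports of type 'lag'."""
    counts = {}
    for port in records:
        if port.get("type") != "lag":
            continue  # physical and other port types are not ifgrps
        lag = port.get("lag", {})
        instance = f"{port['node']['name']}.{port['name']}"
        counts[instance] = (
            len(lag.get("active_ports", [])),
            len(lag.get("member_ports", [])),
        )
    return counts


for instance, (active, members) in ifgrp_port_counts(SAMPLE_PORTS).items():
    print(f"{instance}: {active}/{members} active")
```

With the sample data this prints one line per ifgrp (cluster-01.a0a with 1/2, cluster-02.a0a with 2/2) and skips the physical port e0a.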

Difference from check_netapp_netport

check_netapp_netport monitors the link state of individual physical ports. It also sees ifgrps, but reports them only as up or down. It therefore does not detect partial degradation, where an ifgrp is technically up but some member ports have failed.

check_netapp_ifgrp fills this gap: it checks whether all member ports of an ifgrp are active and raises an alarm when some are not.

Recommendation: Use check_netapp_netport for physical port monitoring and check_netapp_ifgrp for ifgrp health monitoring. You can exclude ifgrps from the netport check with --exclude="~lag$" to avoid duplicate alerts.

Parameters

Parameter              Default  Description
--partial              WARNING  Status to return when an ifgrp has some but not all member ports active. Accepts OK, WARNING, CRITICAL, or UNKNOWN.
--ignore-disabled      false    When set, ifgrps where enabled is false are skipped and not counted.
--include / --exclude  -        Filter ifgrps by instance name. Supports strings and regular expressions.
--no-instances         UNKNOWN  Status to return when no ifgrps are found (e.g. system has no LAGs).
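
How --ignore-disabled and --no-instances might feed into the overall result can be sketched as follows. Everything here is hypothetical: the function name, the input shape, the severity ordering (OK < UNKNOWN < WARNING < CRITICAL), and the choice to apply the --no-instances status when every ifgrp was skipped are assumptions, not documented plugin behavior.

```python
# Assumed severity order, least to most severe; the real plugin may differ.
SEVERITY = ["OK", "UNKNOWN", "WARNING", "CRITICAL"]


def overall_status(ifgrps, ignore_disabled=False, no_instances="UNKNOWN"):
    """Combine per-ifgrp results into one overall status.

    `ifgrps` is a list of dicts with 'status' and 'enabled' keys
    (a hypothetical shape chosen for this sketch).
    """
    # Skip disabled ifgrps only when the flag is set.
    checked = [g for g in ifgrps if g["enabled"] or not ignore_disabled]
    if not checked:
        return no_instances  # nothing left to check
    # The worst individual status decides the overall result.
    return max((g["status"] for g in checked), key=SEVERITY.index)


sample = [
    {"status": "CRITICAL", "enabled": False},
    {"status": "OK", "enabled": True},
]
print(overall_status(sample))                        # CRITICAL
print(overall_status(sample, ignore_disabled=True))  # OK
print(overall_status([], no_instances="OK"))         # OK
```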

Examples

All ifgrps healthy

$ ./check_netapp_ifgrp -H mycluster
NETAPP IFGRP OK - 2 ifgrps checked
cluster-01.a0a: 2/2 active
cluster-02.a0a: 2/2 active

Both ifgrps have all member ports active.


Partial degradation (default: WARNING)

$ ./check_netapp_ifgrp -H mycluster
NETAPP IFGRP WARNING - 2 ifgrps checked, 1 WARNING
cluster-01.a0a: 1/2 active (WARNING)
cluster-02.a0a: 2/2 active

One ifgrp has lost a member port. By default this returns WARNING.
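
If the per-ifgrp lines need to be processed further, for example in a wrapper script, they can be parsed with a regular expression. The sketch below is based only on the output format shown in these examples; parse_ifgrp_lines is a hypothetical helper, and treating a missing status suffix as OK is an assumption.

```python
import re

# Matches lines such as "cluster-01.a0a: 1/2 active (WARNING)".
LINE_RE = re.compile(
    r"^(?P<instance>\S+): (?P<active>\d+)/(?P<members>\d+) active"
    r"(?: \((?P<status>\w+)\))?$"
)


def parse_ifgrp_lines(output: str):
    """Return (instance, active, members, status) tuples from plugin output."""
    results = []
    for line in output.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            # Lines without a "(STATUS)" suffix are assumed to be OK.
            results.append(
                (m["instance"], int(m["active"]), int(m["members"]), m["status"] or "OK")
            )
    return results


sample_output = """NETAPP IFGRP WARNING - 2 ifgrps checked, 1 WARNING
cluster-01.a0a: 1/2 active (WARNING)
cluster-02.a0a: 2/2 active"""

for row in parse_ifgrp_lines(sample_output):
    print(row)
```

The summary line does not match the pattern and is skipped, so only the two ifgrp rows are returned.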


Partial degradation treated as CRITICAL

$ ./check_netapp_ifgrp -H mycluster --partial=CRITICAL
NETAPP IFGRP CRITICAL - 2 ifgrps checked, 1 CRITICAL
cluster-01.a0a: 1/2 active (CRITICAL)
cluster-02.a0a: 2/2 active

With --partial=CRITICAL, a partial degradation raises a CRITICAL alarm instead of WARNING.


Suppressing partial alarms

$ ./check_netapp_ifgrp -H mycluster --partial=OK
NETAPP IFGRP OK - 2 ifgrps checked
cluster-01.a0a: 1/2 active
cluster-02.a0a: 2/2 active

With --partial=OK, partial degradation is reported but does not trigger an alarm. The output still shows the port ratio so operators can see the state.

Even with --partial=OK, an ifgrp with zero active ports will always return CRITICAL.


Ifgrp completely down

$ ./check_netapp_ifgrp -H mycluster
NETAPP IFGRP CRITICAL - 2 ifgrps checked, 1 CRITICAL
cluster-01.a0a: 0/2 active (CRITICAL)
cluster-02.a0a: 2/2 active

An ifgrp with zero active ports is always CRITICAL, regardless of the --partial setting.


Ignoring disabled ifgrps

$ ./check_netapp_ifgrp -H mycluster --ignore-disabled
NETAPP IFGRP OK - 1 ifgrp checked
cluster-02.a0a: 2/2 active
Skipped 1 disabled instances

With --ignore-disabled, disabled ifgrps are excluded from the check entirely.


Filtering by node

$ ./check_netapp_ifgrp -H mycluster --include="~^cluster-02\."
NETAPP IFGRP OK - 1 ifgrp checked
cluster-02.a0a: 2/2 active

Checks only ifgrps on a specific node. See also Regular Expressions.
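
The include filter above can be illustrated with a short sketch. It assumes, based on the examples in this document, that a leading "~" marks the pattern as a regular expression; treating any other pattern as an exact string match is a further assumption, and both function names are hypothetical.

```python
import re


def matches(pattern: str, instance: str) -> bool:
    """Return True if the instance name matches the filter pattern."""
    if pattern.startswith("~"):
        # Assumed convention: "~" introduces a regular expression.
        return re.search(pattern[1:], instance) is not None
    # Assumed behavior for plain strings: exact comparison.
    return pattern == instance


def apply_include(instances, include):
    """Keep only the instances that match the include pattern."""
    return [name for name in instances if matches(include, name)]


names = ["cluster-01.a0a", "cluster-02.a0a"]
print(apply_include(names, r"~^cluster-02\."))  # ['cluster-02.a0a']
```

Escaping the dot in the pattern matters: an unescaped "." would match any character, not just the separator between node and ifgrp name.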


No ifgrps on the system

$ ./check_netapp_ifgrp -H mycluster
NETAPP IFGRP UNKNOWN - 0 ifgrps checked

If the system has no ifgrps, the check returns UNKNOWN by default. Use --no-instances=OK to suppress this alarm.