17. 12. 2019 Juergen Vigna NetEye, Predictive Analysis, Unified Monitoring

Monitor Cluster Disk Space with Icinga2

The Problem

If you’re monitoring a Microsoft Cluster, you’ll want to monitor the disk space of each individual cluster service. Here the Icinga2 Agent has a limitation: you can’t use it with more than one IP address, so you can’t simultaneously monitor the resources of the “physical” host and a “virtual” service host that’s running on that same host. The crux of the problem is that the host name is checked against the name encoded in the certificate, and you cannot use multi-host certificates or certificates with wildcards.

We’ve come across this problem and are trying to find a solution. In the meantime, I wrote a plugin named check_influx_diskspace_cluster.pl that you can download here. It works around the issue by relying on a fixed disk check on every cluster node whose performance data is written to InfluxDB; the plugin then reads that data back from InfluxDB.

How Does It Work?

You have to add a disk check to every physical node of the cluster, which will check all disks on that node. It works best to set very high warning/critical values, so that you won’t get alerts for this service. The service will take the performance data for all disks it finds and write it out to InfluxDB.

Suppose, for example, that you need to monitor disk R: for your cluster service. You have to tell the plugin the name of the metric you want to search for (R: in our case), along with the names of the cluster hosts on which to search for that disk. It then queries the last entry in InfluxDB for this metric on all of the hosts. As the resource will only be mounted on one of the nodes, you get the data from the node where the service actually runs, including its disk space, which the plugin then checks against the warning/critical values you gave as parameters.
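To make the lookup concrete, here is a minimal Python sketch of that logic. It is not the plugin's actual Perl code. It assumes the performance data is stored with "hostname" and "metric" tags and "value" and "max" fields (an assumption about the schema, so check your own database), and the function names build_last_value_query and first_node_with_disk are hypothetical.

```python
from typing import Dict, Iterable, Optional, Tuple


def build_last_value_query(measurement: str, host: str, metric: str) -> str:
    """Build an InfluxQL query for the newest sample of one disk on one host.

    Assumption: points carry "hostname" and "metric" tags plus "value" and
    "max" fields. Adjust these names to match your own schema.
    """
    def quote(text: str) -> str:
        # InfluxQL string literals use single quotes; escape embedded ones.
        return "'" + text.replace("\\", "\\\\").replace("'", "\\'") + "'"

    return (
        'SELECT last("value") AS "value", last("max") AS "max" '
        f'FROM "{measurement}" '
        f'WHERE "hostname" = {quote(host)} AND "metric" = {quote(metric)}'
    )


def first_node_with_disk(
    hosts: Iterable[str],
    latest_by_host: Dict[str, Optional[dict]],
) -> Optional[Tuple[str, dict]]:
    """Return the first host that has a sample for the disk, or None.

    latest_by_host maps each node to its newest point, or to None when the
    node has no data for that disk (hypothetical, pre-fetched input).
    """
    for host in hosts:
        point = latest_by_host.get(host)
        if point is not None:
            return host, point
    return None
```

With hosts server1, server2 and server3 and data only on server2, first_node_with_disk returns the server2 entry, which mirrors the "returns the first found instance" behaviour described in the plugin's help text.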

Plugin Parameters

# /neteye/shared/monitoring/plugins/check_influx_diskspace_cluster.pl --help
check_influx_diskspace_cluster.pl 1.0.0
This nagios plugin is free software, and comes with ABSOLUTELY NO WARRANTY.
It may be used, redistributed and/or modified under the terms of the GNU
General Public Licence (see http://www.fsf.org/licensing/licenses/gpl.txt).

Gets last value for given disk metric for more servers and returns the first found instance. All values are retreived from the influxdb, status, max, warning, critical

Usage: check_influx_diskspace_cluster.pl [-H <influxdb hostname/IP>] [-p <influxdb port>] -S <regex-hostname>
[-M <measurement-name>] -m <disk-metric-name> [-w <warning>] [-c <critical>]
[ -V ] [ -h ]

-?, --usage
Print usage information
-h, --help
Print detailed help screen
-V, --version
Print version information
--extra-opts=[section][@file]
Read options from an ini file. See http://nagiosplugins.org/extra-opts
for usage and examples.
-H, --host=<hostname>
influxdb hostname (Default: influxdb.neteyelocal)
-p, --port=<influx-port>
influxdb tcp port (Default: 8086)
-M, --measurement=<name>
influxdb measurements to use (Default: disk-windows)
-S, --server=<servers>
Cluster Servers to get the disk-values from. This is a coma separated list of server-names as found in the monitoring.
-m, --metric=<string>
disk-metric string to search for
-D, --debug
Give DEBUG output
-w, --warning
warning value for % free disk (if not defined get it from DB)
-c, --critical
critcal value for % free disk (if not defined get it from DB)
-t, --timeout=INTEGER
Seconds before plugin times out (default: 30)
-v, --verbose
Show details for command-line debugging (can repeat up to 3 times)
-v, --verbose
Give verbose output

Copyright 2019 WuerthPhoenix

Example Call for the Plugin

check_influx_diskspace_cluster.pl -m R: -S server[123] -w 10 -c 5

Plugin Output

OK - DISK free space: R:(server2) 15453.00 MB (39%) | R:=16203644928;4189270835.2;2094635417.6;0;41892708352
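The part after the pipe is performance data in the usual plugin layout label=value;warn;crit;min;max. In this sample the value is the free space in bytes, and the warning and critical thresholds are 10% and 5% of the maximum. The following Python sketch shows how those numbers relate to the readable part of the output. The helper names parse_perfdata and free_space_state are hypothetical, the parser assumes plain numbers without a unit suffix as in this sample, and the comparison direction (alert when free space drops below a threshold) is an assumption, not taken from the plugin source.

```python
def parse_perfdata(item: str) -> dict:
    """Split one 'label=value;warn;crit;min;max' item into named numbers."""
    label, _, values = item.rpartition("=")
    keys = ("value", "warn", "crit", "min", "max")
    parsed = {"label": label}
    for key, raw in zip(keys, values.split(";")):
        if raw != "":  # optional fields may be left empty
            parsed[key] = float(raw)
    return parsed


def free_space_state(perf: dict) -> str:
    """Map free space to a state; assumes lower free space is worse."""
    if perf["value"] < perf["crit"]:
        return "CRITICAL"
    if perf["value"] < perf["warn"]:
        return "WARNING"
    return "OK"


perf = parse_perfdata("R:=16203644928;4189270835.2;2094635417.6;0;41892708352")
free_mb = perf["value"] / 1024 ** 2                      # bytes to MB
free_percent = round(100 * perf["value"] / perf["max"])  # share of total size
print(f"{free_space_state(perf)} {perf['label']} {free_mb:.2f} MB ({free_percent}%)")
```

For the sample line this gives 15453.00 MB and 39% free, which is above both thresholds, so the state is OK.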

Juergen Vigna

NetEye Solution Architect at Würth Phoenix
I have over 20 years of experience in the IT industry. After gaining my first experience in software development for public transport companies, I decided to join the young and growing team of Würth Phoenix. Initially, I was responsible for the internal Linux/Unix infrastructure and the management of CVS software. Afterwards, my main challenge was to establish the now well-known IT System Management Solution WÜRTHPHOENIX NetEye. As a Product Manager I started building NetEye from scratch, analyzing existing open source models, extending them and finally joining them into one single powerful solution. After that, my job turned into a passion: constant development, customer installations and support became a personal matter. Today I use my knowledge as a NetEye Senior Consultant as well as NetEye Solution Architect at Würth Phoenix.

