04. 01. 2011 Patrick Zambelli Capacity Management, Nagios

Monitoring Nagios Performance Graphs

This post introduces a check strategy to monitor the freshness of Nagios performance graphs.

When the feature is enabled, Nagios generates performance graphs that are updated automatically each time a check is executed. Along with its status, a check result delivers the so-called “performance data”, which is interpreted by dedicated listeners behind the Nagios process. The results are stored and maintained in RRD databases.

After Nagios checks are updated, the number or the names of the data sources in a check result may change. Within an RRD database, however, data sources occupy fixed positions, and with RRDTool version 1.2 the database itself cannot be extended. As a result, the performance graph stops growing and a NaN error occurs. From that moment on, the database is no longer updated.

Example: The disk space check detects an additional volume and therefore reports an additional data source.
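
To illustrate the example, the following is a minimal sketch that counts the data sources in a performance data string before and after a volume is added. The volume names and values are made up for illustration, and the helper count_datasources is hypothetical. It assumes labels without spaces, so quoted labels are not handled.

```shell
#!/bin/sh
# Count the data sources (label=value entries) in a Nagios performance data string.
# Simplification: labels are assumed to contain no spaces.
count_datasources() {
    # Put each space-separated entry on its own line and count those with '='.
    echo "$1" | tr ' ' '\n' | grep -c '='
}

# Illustrative perfdata of a disk space check (values are invented).
before="/=2643MB;5948;5958;0;5968 /boot=68MB;88;93;0;98"
# The same check after an additional volume appears.
after="$before /data=1024MB;9000;9500;0;10000"

echo "before: $(count_datasources "$before") data sources"
echo "after:  $(count_datasources "$after") data sources"
```

An RRD file created for the first result has room for two data sources only, so the third one reported later cannot be stored.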

The check

The check runs on the file structure of the archive containing the RRD files. Old files are identified and highlighted.

The argument -a defines the maximum time in seconds that may pass since the last modification before a file is considered “outdated”.

The arguments -w and -c, in turn, define the number of such old files that must be found to produce a warning or critical result.
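
The logic described above could be sketched roughly as follows. This is not the source of check_perfdata.sh. The function names count_old_rrds and evaluate are hypothetical, the sketch assumes a find that supports -mmin (GNU and BSD find do), and it assumes that a state is raised as soon as the number of old files reaches the threshold.

```shell
#!/bin/sh
# Hypothetical sketch of the freshness logic, not the original plugin code.

# Count *.rrd files below a directory that were last modified more than
# the given number of seconds ago.
count_old_rrds() {
    # $1 = perfdata directory, $2 = maximum file age in seconds
    # find's -mmin works in minutes, so the age is converted first.
    find "$1" -type f -name '*.rrd' -mmin "+$(( $2 / 60 ))" | wc -l | tr -d ' '
}

# Map the number of old files to a Nagios state (0 = OK, 1 = WARNING, 2 = CRITICAL).
# Assumption: reaching a threshold is enough to raise the state.
evaluate() {
    # $1 = number of old files, $2 = warning threshold, $3 = critical threshold
    if [ "$1" -ge "$3" ]; then
        echo "CRITICAL - $1 old RRD files"
        return 2
    elif [ "$1" -ge "$2" ]; then
        echo "WARNING - $1 old RRD files"
        return 1
    fi
    echo "OK - $1 old RRD files"
    return 0
}
```

With these helpers, a call such as evaluate "$(count_old_rrds /path/to/perfdata 86400)" 1 10 would mirror the default values of the plugin. The directory path is a placeholder.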

Check result on perfdata archive

An additional argument ‘-R’ makes the check automatically remove any files that reach the freshness limit. In this case both the RRD file (file.rrd) and the corresponding XML file (file.xml) are removed.
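
A cleanup of this kind might look like the sketch below. It is an illustration under assumptions, not the plugin's actual implementation. The function name remove_old_rrds is hypothetical, the XML file is assumed to sit next to the RRD file under the same base name, and find is assumed to support -mmin.

```shell
#!/bin/sh
# Hypothetical sketch: remove each outdated .rrd file together with its .xml companion.
remove_old_rrds() {
    # $1 = perfdata directory, $2 = maximum file age in seconds
    find "$1" -type f -name '*.rrd' -mmin "+$(( $2 / 60 ))" |
    while IFS= read -r rrd; do
        # Strip the .rrd suffix to derive the name of the XML file.
        rm -f -- "$rrd" "${rrd%.rrd}.xml"
        echo "removed $rrd"
    done
}
```

Since the removal is irreversible, it may be worth running the check without -R first to see which files would be affected.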

A check result where old files have been found:

Download

check_perfdata

Help

[root@pbzsilx001 plugins]# ./check_perfdata.sh -h
Usage: check_perfdata.sh [-a max_file_age] [-w max_warning] [-c max_critical] [-R]
-h, --help    : this help

-a max_file_age: Integer of seconds since last file modification to determine that a RRD is old. [86400]
-w max_warning:  Integer of maximum number of old RRD files to get a warning [1]
-c max_critical: Integer of maximum number of old RRD files to get a critical [10]
-R               Set this argument to automatically remove a old RRD file

Usage examples:

./check_perfdata.sh -a 86400 -w 1 -c 10

Patrick Zambelli

Product Manager at Würth Phoenix
After graduating in Applied Computer Science from the Free University of Bolzano, I decided to start my professional career outside the province. With a bit of good timing and good luck I joined the booming IT department of Geox in the shoe district of Montebelluna, where I learned how a large IT infrastructure has to grow and adapt to quickly changing requirements. During this time I also had the opportunity to travel the world while setting up the company's various production and retail areas. After arriving at Würth Phoenix, I started developing our monitoring solution NetEye. Today, in my position as Product Manager, I aim to continuously improve our solutions and to adapt them to current market requirements.
