Alarms and monitoring go hand in hand. Whenever an algorithm or threshold is used to decide whether the current value of a registered KPI should raise an alarm, the result can be a hit, a correct reject, a miss, or a false alarm.
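The four outcomes can be sketched as a small decision table; the function name and arguments below are illustrative, not part of any monitoring product:

```python
# Illustrative sketch: classify one alarm decision against the ground truth.
# "alarm_raised" is what the detector decided; "problem_present" is whether
# a real problem actually occurred.
def outcome(alarm_raised: bool, problem_present: bool) -> str:
    if alarm_raised and problem_present:
        return "hit"
    if alarm_raised and not problem_present:
        return "false alarm"
    if not alarm_raised and problem_present:
        return "miss"
    return "correct reject"

print(outcome(True, False))  # false alarm
```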
The standard way to raise alarms is to study standard traffic – which should not raise alarms – and to decide on a static threshold based on the historical standard traffic (see, for example, Figure 1) and on experience. Everything below the threshold is then considered standard traffic, and everything above it raises an alarm. This kind of threshold-based alarm creation is robust to many outliers and may be sufficient if the mean of the standard traffic does not change dynamically (otherwise the threshold needs to be adapted dynamically, too). However, signals may also contain anomalies that are quite useful for problem detection but look very different from classic (more or less extreme) outliers. For example, a change in the distribution (see Figure 2, red area on the right) can be a first sign of instability, and taking immediate counter-action can prevent the anomaly from turning into a real problem.
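A minimal sketch of such a static threshold, assuming the common "mean plus k standard deviations" rule (the value of k and the sample data are illustrative assumptions, not from the text):

```python
# Minimal sketch: derive a static threshold from historical standard traffic
# and flag everything above it. The k=3 rule of thumb is an assumption.
import statistics

def static_threshold(history, k=3.0):
    """Mean + k sample standard deviations of historical standard traffic."""
    mu = statistics.mean(history)
    sigma = statistics.stdev(history)
    return mu + k * sigma

def alarms(values, threshold):
    """Indices of samples exceeding the threshold."""
    return [i for i, v in enumerate(values) if v > threshold]

history = [10, 11, 9, 10, 12, 10, 11, 9]   # illustrative standard traffic
t = static_threshold(history)
print(alarms([10, 11, 30, 9], t))  # only the spike at index 2 exceeds it
```

Note the limitation the text describes: a distribution change that stays below the threshold would pass this check silently.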
For this reason, studying alternative, more sophisticated alerting mechanisms is a useful addition to current common practice. Being able to differentiate between different types of anomalies, and to detect those that traditional methods could not have found, is a real step forward when monitoring KPIs from increasingly complex network traffic or performance counters. Würth Phoenix is currently putting effort into analysing methods from the fields of statistics and machine learning for alarm generation, in order to guarantee sound alarm quality to its customers.
For example, methods based on signal decomposition (see above), where the signal is first split into a trend and seasonal components that repeat periodically, followed by a close study of the residual activity, have already shown promising preliminary results (see below).
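The idea can be sketched with a simple additive decomposition, assuming a known period; this is an illustration, not Würth Phoenix's actual pipeline, and libraries such as statsmodels offer far more complete implementations:

```python
# Illustrative sketch: additive decomposition of a signal into trend,
# seasonal component, and residual, assuming the period is known.
import numpy as np

def decompose(signal, period):
    signal = np.asarray(signal, dtype=float)
    # Trend: moving average over one period.
    kernel = np.ones(period) / period
    trend = np.convolve(signal, kernel, mode="same")
    detrended = signal - trend
    # Seasonal component: mean of the detrended signal at each phase,
    # repeated cyclically over the full length of the signal.
    phase_means = np.array([detrended[p::period].mean() for p in range(period)])
    seasonal = np.resize(phase_means, len(signal))
    # Residual: whatever trend and seasonality do not explain.
    residual = signal - trend - seasonal
    return trend, seasonal, residual
```

On a clean periodic signal with a linear trend, the residual carries far less variance than the raw signal, which is what makes anomalies in it stand out.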
How can this more sophisticated analysis of the traffic help to create smarter alarms?
Especially when configuring “unknown” systems, such as new applications or networks from new customers that your solution is expected to monitor, it is not always easy to learn what standard behaviour should look like. You need time to build up experience; automatic recognition based on anomaly detection “only” needs data (although here, too, the alarm quality can be expected to improve with more historical data).
Especially interesting is the combination of anomaly detection with traditional methods. As a first step in a promising direction, the goal here is, for example, to use anomaly detection to filter the most relevant alarms from those detected by traditional methods and thereby avoid false alarms.
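A hedged sketch of that filtering idea: keep a traditional threshold alarm only if the corresponding sample also looks anomalous in the residual (here via a simple z-score test). All function names and the z-score cut-off are illustrative assumptions:

```python
# Sketch: intersect traditional threshold alarms with residual-based
# anomaly detection to suppress false alarms. The z_cut value is an
# illustrative assumption.
import statistics

def threshold_alarms(values, threshold):
    """Traditional method: indices above a static threshold."""
    return {i for i, v in enumerate(values) if v > threshold}

def residual_alarms(residuals, z_cut=3.0):
    """Anomaly detection: indices whose residual z-score exceeds z_cut."""
    mu = statistics.mean(residuals)
    sigma = statistics.stdev(residuals)
    return {i for i, r in enumerate(residuals) if abs(r - mu) > z_cut * sigma}

def filtered_alarms(values, residuals, threshold, z_cut=3.0):
    """Keep only threshold alarms confirmed by the residual analysis."""
    return sorted(threshold_alarms(values, threshold)
                  & residual_alarms(residuals, z_cut))
```

A spike that crosses the threshold but is fully explained by trend and seasonality leaves no trace in the residual and is therefore filtered out, which is exactly the false-alarm reduction described above.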