Next Level Performance Monitoring – Part II: The Role of Machine Learning and Anomaly Detection

Posted by on Aug 2, 2017 in NetEye, Real User Experience Monitoring | 0 comments

Machine learning and anomaly detection are being mentioned with increasing frequency in performance monitoring. But what are they and why is interest in them rising so quickly?

From Statistics to Machine Learning

There have been several attempts to explicitly differentiate between machine learning and statistics. It is not so easy to draw a line between them, though.

For instance, different experts have said:

  • “There is no difference between Machine Learning and Statistics” (in terms of the maths, the books, the teaching, and so on)
  • “Machine Learning is completely different from Statistics” (and it is the only future of both)
  • “Statistics is the true and only one” (Machine Learning is just a different name used for part of statistics by people who do not understand the real concepts behind what they are doing)

The interested reader is also referred to:
  • Breiman – Statistical Modeling: The Two Cultures
  • Statistics vs. Machine Learning, fight!

In short, we will not answer this question here. For monitoring people, however, it is still relevant that the machine learning and statistics communities currently focus on different goals, and that it can be convenient to use methods from both fields. The statistics community focuses on inference: they want to infer the process by which the data were generated. The machine learning community puts the emphasis on prediction: what is future data expected to look like? Obviously, the two interests are not independent; knowledge about the generating model can be used to build an even better predictor or anomaly detection algorithm.
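The two viewpoints can be illustrated on a toy monitoring signal. The sketch below, using only the Python standard library and invented response-time numbers, first infers the parameters of an assumed generating process (the statistics viewpoint) and then uses that model to judge future observations (the prediction/anomaly-detection viewpoint):

```python
# A minimal sketch of the two viewpoints on the same data, stdlib only.
# All numbers are illustrative response times in milliseconds.
import statistics

history = [102, 98, 105, 99, 101, 97, 103, 100, 104, 96]  # "standard" traffic

# Statistics viewpoint (inference): estimate the parameters of the
# process assumed to have generated the data.
mu = statistics.mean(history)
sigma = statistics.stdev(history)
print(f"inferred model: N(mu={mu:.1f}, sigma={sigma:.2f})")

# Machine-learning viewpoint (prediction): use the data to judge what
# future observations should look like, e.g. flag implausible values.
def is_anomalous(value, k=3):
    """Flag values more than k standard deviations from the mean."""
    return abs(value - mu) > k * sigma

print(is_anomalous(101))  # a typical value -> False
print(is_anomalous(180))  # a clear outlier -> True
```

Here the inferred model directly powers the predictor, which is exactly why the two interests are hard to separate in practice.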

Read More

Next Level Performance Monitoring – Part I

Posted by on Jun 20, 2017 in NetEye, Network Traffic Monitoring, Real User Experience Monitoring | 0 comments

Network traffic keeps becoming more and more heterogeneous, and in many cases it is no longer enough to monitor a system as we have done in the past. Here I will present the key ingredients, according to Würth Phoenix, for successful state-of-the-art performance monitoring and proactive analysis of those applications that are critical for your business.

Combining User Experience and Performance Metrics for new Insights

User experience is a very important factor: if your measurements seem to be in the right range, but end users complain about slow applications, you need to act. For this reason, user experience combined with an overview of all monitored servers is the right place to start. In our opinion it is of vital importance to know when critical business applications begin to slow down before your users start to complain. You can achieve this by running continuous checks via Alyvix, our active user experience monitoring solution. Test cases can be written specifically for the most vital parts of your applications, and the functionality and speed of those very parts can be checked as often as needed. The resulting performance of each individual user interaction tested is then saved into the same central time series database as the performance metrics registered from all original sources of interest (such as Perfmon data, ESX performance data, etc.). It is then possible to perform a multi-server zoom and, with a single click, navigate to the most interesting servers during the time periods where Alyvix detected problems.
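Since the central time series database in this setup is InfluxDB, one way to picture the "same database for everything" idea is a single line-protocol record per tested user interaction. The sketch below builds such a record; the measurement name `alyvix_testcase` and the tag and field names are invented for illustration and are not Alyvix's actual schema:

```python
# Hypothetical sketch: encoding one Alyvix test-case timing as an InfluxDB
# line-protocol record before writing it to the central time series database.
# Measurement, tag and field names are invented, not Alyvix's real schema.

def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Build one InfluxDB line-protocol record:
    measurement,tag=val,... field=val,... timestamp"""
    tag_part = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    field_part = ",".join(f"{k}={v}" for k, v in sorted(fields.items()))
    return f"{measurement},{tag_part} {field_part} {timestamp_ns}"

line = to_line_protocol(
    measurement="alyvix_testcase",                 # invented name
    tags={"testcase": "erp_login", "host": "client01"},
    fields={"duration_ms": 1834, "state": 0},      # 0 = OK
    timestamp_ns=1496047378000000000,
)
print(line)
```

Because the active user-experience timings share a database with the Perfmon and ESX metrics, a dashboard can correlate them over the same time axis with no extra plumbing.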


Read More

How to Tune Your Grafana Dashboards

Posted by on Feb 17, 2017 in NetEye | 2 comments

Grafana Tuning

Grafana and InfluxDB have been integrated into our IT System Management solution NetEye. This step was motivated by the high flexibility and versatility offered by the combination of these two open source tools. Besides modules such as Log Management, Inventory & Asset Management, Business Service Management and many others, NetEye now also offers an IT Operations Analytics module. In this article, we will share some tricks with you so you can get even more out of the power of Grafana when experimenting with the new Grafana dashboards in NetEye.
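One typical tuning trick of this kind is to let a panel adapt to the selected time range automatically, by using Grafana's `$timeFilter` and `$interval` template variables in the InfluxQL query instead of hard-coded ranges and resolutions. The sketch below just assembles such a query as a string; the measurement and field names are illustrative, not taken from NetEye:

```python
# Illustrative Grafana panel query for an InfluxDB data source.
# $timeFilter and $interval are Grafana template variables that expand to
# the dashboard's current time range and auto-computed group-by interval.
# "perfdata" and "response_time" are made-up names for this example.

query = (
    'SELECT mean("response_time") '
    'FROM "perfdata" '
    "WHERE $timeFilter "
    'GROUP BY time($interval), "host" fill(null)'
)
print(query)
```

With `time($interval)` in the GROUP BY clause, zooming from a week down to an hour automatically re-aggregates the series at a sensible resolution instead of overplotting raw points.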

Read More

How to use anomaly detection to create smarter alerts

Posted by on Nov 11, 2016 in Network Traffic Monitoring, Real User Experience Monitoring | 0 comments

Alarms and monitoring go hand in hand. Whenever an algorithm or threshold is used to decide whether the current value of a registered KPI should raise an alarm or not, the result can be a hit, a correct reject, a miss or a false alarm.


The standard way to raise alarms is to study standard traffic – which should not raise alarms – and to decide on a static threshold based on the historic standard traffic (see Figure 1 for an example) and on experience. Everything below the threshold is then considered standard traffic, and everything above it raises an alarm. This kind of threshold-based alarm creation catches many outliers and might be sufficient if the mean of the standard traffic does not change dynamically (otherwise the threshold needs to be adapted dynamically, too). However, signals might also contain anomalies that are quite useful for problem detection but look very different from classic (more or less extreme) outliers. For example, a change in the distribution (see Figure 2, red area on the right) can be a first sign of instability, and taking an immediate counter-action can prevent the anomaly from turning into a real problem.
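The contrast between the two cases can be sketched in a few lines of stdlib Python on synthetic data: a static threshold derived from historic standard traffic, and a deliberately crude distribution-change check (a jump in windowed variance) that fires where the static threshold stays silent. The window size and factor are arbitrary choices for this toy example:

```python
# Two alerting ideas on synthetic data: (1) a static threshold from historic
# standard traffic, (2) a crude check for a change in the distribution
# (a jump in windowed variance) that the static threshold misses.
import statistics

historic = [50, 52, 48, 51, 49, 50, 53, 47, 50, 51]   # standard traffic
threshold = statistics.mean(historic) + 3 * statistics.stdev(historic)

def static_alarm(value):
    """Classic alerting: everything above the threshold raises an alarm."""
    return value > threshold

# A signal whose values all stay below the threshold, but whose
# variability suddenly explodes in the second half:
signal = [50, 51, 49, 50, 52, 48, 46, 54, 45, 55, 45, 55]

def variance_change_alarm(values, window=6, factor=4.0):
    """Alarm when the variance over the last window exceeds the historic
    variance by `factor` -- a deliberately crude distribution-change check."""
    recent_var = statistics.variance(values[-window:])
    return recent_var > factor * statistics.variance(historic)

print(any(static_alarm(v) for v in signal))  # False: no value crosses it
print(variance_change_alarm(signal))         # True: the distribution changed
```

The point is not this particular detector, but that a second, distribution-aware rule can fire on exactly the kind of anomaly that slips under a static threshold.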


For this reason, the study of alternative, more sophisticated alerting mechanisms is a useful addition to current common practice.

Read More

Congratulations to the winners of the NetCla Challenge

Posted by on Oct 5, 2016 in NetEye, Network Traffic Monitoring, Real User Experience Monitoring | 0 comments

More than 100 teams competed, more than 25 submitted a solution, and the best reached a macro-F1 score higher than 0.88.

Last Friday, after six long weeks, the time had finally come. During the ECML-PKDD conference in Riva del Garda, the best of the competing approaches were described and discussed. The participants had the chance to get answers directly from the organizers, and, last but not least, Iryna Haponchyk – leader of the winning team – was awarded 1000 Euro for the solution with the highest macro-F1 score, or better, for having created a model capable of producing such a score. Here you can see the beaming winner during the discovery challenge prize ceremony.

Iryna explained that her team trained a standard multi-class linear SVM classifier, after preliminarily enriching the provided attribute set with features generated using a random forest, and with features encoding the interdependency between examples that are close to each other in time.
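One ingredient of that recipe – encoding the interdependency between temporally close examples – can be sketched as a simple feature-enrichment step. The stdlib snippet below is one plausible interpretation, not the winning team's actual code, and it omits both the random-forest-generated features and the SVM itself:

```python
# A hedged sketch of temporal-neighbour feature enrichment: each example's
# feature vector is concatenated with those of its predecessor and successor
# in time, so a linear classifier can exploit interdependency between
# examples close to each other in time. Not the winning team's actual code.

def add_neighbour_features(examples):
    """Concatenate each example's features with those of its temporal
    neighbours, zero-padding at the boundaries.

    `examples` is a time-ordered list of feature vectors (lists of floats)."""
    dim = len(examples[0])
    padded = [[0.0] * dim] + examples + [[0.0] * dim]
    return [padded[i - 1] + padded[i] + padded[i + 1]
            for i in range(1, len(examples) + 1)]

# Three toy examples with two features each, ordered by time:
X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
enriched = add_neighbour_features(X)
print(enriched[1])  # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
```

After enrichment, each example carries three times the original feature dimension, and a plain linear model can pick up patterns that span adjacent time steps.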

Read More