14. 03. 2016
More Precise SLA Reports through Event Correction
Downtimes play an important role in creating and interpreting an SLA report correctly. While downtimes and status changes can be scheduled in advance, unexpected changes have to be corrected or approved retroactively.
Let me give you a practical example to better explain what I mean:
Think of an ISP that has agreed on an SLA with a customer, say an availability of 99.8%. The ISP provides excellent service and always reaches the agreed availability. At some point, the customer's internet connection is interrupted and the availability falls below the agreed level. The interruption, however, was not caused by poor service on the ISP's part but by a third party, for example road works that accidentally cut a cable. In this case, the interruption should not affect the calculation of SLA compliance, because it was not the ISP's fault. To be as transparent as possible towards its customer, the ISP should be able to exclude this downtime from the SLA report and to add a written note for future traceability.
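To put numbers on this example, here is a minimal sketch of the arithmetic. The 30-day reporting period and the 3-hour outage are assumptions chosen for illustration, and the sketch assumes that excluded downtime is simply counted as available time, which may differ from how a given reporting tool treats it.

```python
def availability(period_minutes, downtime_minutes, excluded_minutes=0.0):
    """Availability in percent; excluded minutes are counted as available time."""
    counted = max(downtime_minutes - excluded_minutes, 0.0)
    return 100.0 * (period_minutes - counted) / period_minutes


SLA_TARGET = 99.8            # agreed availability in percent (from the example)
PERIOD = 30 * 24 * 60        # assumed 30-day reporting period, in minutes
OUTAGE = 3 * 60              # assumed 3-hour outage caused by the cut cable

raw = availability(PERIOD, OUTAGE)
corrected = availability(PERIOD, OUTAGE, excluded_minutes=OUTAGE)

print(f"without correction: {raw:.3f}% (SLA met: {raw >= SLA_TARGET})")
print(f"with correction:    {corrected:.3f}% (SLA met: {corrected >= SLA_TARGET})")
```

Under these assumptions, a 99.8% target leaves only about 86 minutes of downtime per 30 days, so a single 3-hour outage is enough to miss the SLA unless it is excluded.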
Our approach to problems of this kind is not to correct only the SLA report, but to correct the log entries themselves and then calculate the SLA report from the “correct” logs.
Event Correction consists of two parts: the creation of the Event Correction and its application to the logs.
Creating an Event Correction requires information such as:
- Service (optional)
- Corrected Status
- Start Date
- End Date
The idea is to define a period during which the recorded state is replaced with a new state, or during which a downtime is retroactively defined or removed.
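The fields listed above could be modelled roughly as follows. This is only an illustrative sketch, not the plug-in's actual data model: the class name, the `host` and `comment` fields, and the `covers` helper are assumptions added for the example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class EventCorrection:
    """Hypothetical record describing one Event Correction."""
    host: str                       # assumption: a correction is scoped to a host
    corrected_status: str           # the state to report for the period
    start: datetime                 # start date of the corrected period
    end: datetime                   # end date of the corrected period
    service: Optional[str] = None   # optional; None means the host itself
    comment: str = ""               # assumption: free-text note for traceability

    def __post_init__(self):
        if self.end <= self.start:
            raise ValueError("end date must be after start date")

    def covers(self, moment):
        """True if the given moment lies inside the corrected period."""
        return self.start <= moment < self.end


correction = EventCorrection(
    host="customer-router",
    service="Internet Link",
    corrected_status="OK",
    start=datetime(2016, 3, 10, 9, 0),
    end=datetime(2016, 3, 10, 12, 0),
    comment="Cable cut by third-party road works",
)
print(correction.covers(datetime(2016, 3, 10, 10, 30)))
```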
Event Corrections can be created with the dedicated plug-in, or via the links on the page in Thruk that calculates the availability of hosts and services.
Once an Event Correction exists, it is applied to the log by adding new entries or by replacing incorrect entries, while maintaining the correct log structure. In this way, reports can be generated based on the original as well as on the corrected data.
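As a rough illustration of this step, the following sketch applies a correction to a simplified log. It assumes the log is a time-sorted list of `(timestamp, state)` state changes, which is a simplification and not the real log format; `apply_correction` and `non_ok_seconds` are hypothetical helpers written for this example.

```python
from datetime import datetime


def apply_correction(log, start, end, corrected_state):
    """Return a new log in which the state is corrected_state from start to end.

    The input list is left untouched, so a report can still be built from it.
    """
    before = [(ts, st) for ts, st in log if ts < start]
    inside = [(ts, st) for ts, st in log if start <= ts <= end]
    after = [(ts, st) for ts, st in log if ts > end]
    # Restore the state the original log had reached by the end of the period.
    prior = inside or before
    restored = [(end, prior[-1][1])] if prior else []
    return before + [(start, corrected_state)] + restored + after


def non_ok_seconds(log, period_end):
    """Total time spent in any state other than OK, up to period_end."""
    total = 0.0
    for (ts, state), (next_ts, _) in zip(log, log[1:] + [(period_end, None)]):
        if state != "OK":
            total += (next_ts - ts).total_seconds()
    return total


original = [
    (datetime(2016, 3, 1, 0, 0), "OK"),
    (datetime(2016, 3, 10, 9, 0), "CRITICAL"),   # cable cut
    (datetime(2016, 3, 10, 12, 0), "OK"),        # connection restored
]
period_end = datetime(2016, 3, 31, 0, 0)

corrected = apply_correction(
    original,
    start=datetime(2016, 3, 10, 9, 0),
    end=datetime(2016, 3, 10, 12, 0),
    corrected_state="OK",
)

print(non_ok_seconds(original, period_end))   # outage time in the original log
print(non_ok_seconds(corrected, period_end))  # outage time after the correction
```

Returning a new list rather than editing the log in place is a design choice of this sketch; it mirrors the point above that both the original and the corrected data remain available for reporting.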
Please note that, in order to view or manage Event Corrections, the corresponding settings must be defined in the user profile.
Latest posts by Lukas Franceschini