28. 10. 2020 Nicola Degara Business Service Monitoring, Cloud, Icinga Web 2, NetEye, Service Management, SLM, Unified Monitoring

NetEye Incident Response with Atlassian Opsgenie Integration

As an Atlassian partner, we are currently preparing an online demo system alongside the rollout of our new online NetEye demo service. For the occasion, I decided to expand our existing NetEye integration with the Atlassian ecosystem, which is described in previous blog posts.

With this article, I would like to introduce the new integration between NetEye and the Opsgenie Cloud Solution.

Atlassian defines Opsgenie as “a modern incident management platform that ensures critical incidents are never missed, and actions are taken by the right people in the shortest possible time. Opsgenie receives alerts from your monitoring systems and custom applications and categorizes each alert based on importance and timing. On-call schedules ensure the right people are notified through multiple communication channels including voice calls, email, SMS, and push messages on mobile devices. If an alert is not acknowledged, Opsgenie automatically escalates it, ensuring the incident gets the needed attention.”

The meaning of this integration

I personally think that Opsgenie is Atlassian’s answer to the Incident Response concept: it will help you to organize teams and resources for major incident resolution and stakeholder engagement!

With this new integration, you can now:

  • Send notification alarms from NetEye to the Opsgenie Incident Dashboard
  • Ack and un-Ack an Incident from Opsgenie back to NetEye
  • Add comments to NetEye directly from the Opsgenie Incident Dashboard
  • Synchronize NetEye Business Process with Opsgenie Team Services
  • … and of course use all the features available in Opsgenie to manage the lifecycle of the events received from NetEye!

One more thing: the integration between NetEye and Opsgenie runs completely over port 443!

A sample Use Case

Thanks to this integration, it is possible to implement a common, extended Incident Response process, from the generation of NetEye Monitoring Events up to the eventual incident resolution and stakeholder engagement.

Among the functionalities available for stakeholder engagement, I believe one concrete benefit is the possibility of sending custom real-time messages to end users about a degraded service reported by NetEye. Here is an example of the first message that I, as a stakeholder in my test, received on my smartwatch:

[Image: stakeholder message received on a smartwatch]

Pretty cool, right?

Integration results screenshots

  • NetEye events are available for browsing through the Opsgenie Alert Dashboard
  • The integration sends all event information from NetEye to Opsgenie, where it can be used to investigate the events more easily
  • Alerts from NetEye may be converted into Opsgenie Incidents for further investigation by the team
  • Alerts in NetEye may be acked or un-acked directly from the Opsgenie GUI, and the result is updated directly in NetEye
  • All the activities (ack/comments/events/status) are stored in the history of Services/Hosts
  • An Opsgenie Incident can be linked with Jira Service Desk incident requests thanks to a new ITSM template

Interface installation notes

To implement the integration you can follow the online Icinga2/Opsgenie how-to guide: https://docs.Opsgenie.com/docs/icinga2-integration. Be careful, though: some custom configuration is required because you are installing the integration inside a NetEye environment.

Here I would like to list some important differences and sections you need to be aware of:

  • API
    Depending on the data location you selected when creating your Opsgenie cloud instance, make sure you are using the right API server address: if you created an instance in Europe, the correct API address is https://api.eu.Opsgenie.com. For a U.S. instance you can use the default API address (https://api.Opsgenie.com). This particular detail cost me a lot of debugging time :-)

The head of your file /home/Opsgenie/oec/conf/config.json should look like:

{
  "apiKey": "[OPSGENIE ICINGA2 INTEGRATION KEY]",
  "baseUrl": "https://api.eu.Opsgenie.com",
  "logLevel": "DEBUG",
  "globalArgs": [],
  "globalFlags": {
    "graphite_url": "http://localhost:5003",
    "api_url": "https://localhost:5665",
    "user": "root",
    "password": "[Icinga2 API root password]",
    "insecure": "false"
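
Since a wrong API host is easy to overlook, a small check along these lines can catch it early. This is only a sketch: `opsgenie_base_url` and `check_base_url` are hypothetical helper names, not part of oec, and only the two API addresses mentioned above are covered.

```shell
#!/bin/sh
# Hypothetical helper (not part of oec): print the API base URL for a region.
# Only the two addresses mentioned in this article are known here.
opsgenie_base_url() {
    case "$1" in
        eu) echo "https://api.eu.opsgenie.com" ;;
        us) echo "https://api.opsgenie.com" ;;
        *)  echo "unknown region: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: compare a configured baseUrl ($1) with the one
# expected for a region ($2). The value is lowercased first, because
# host names are case-insensitive and baseUrl carries no path.
check_base_url() {
    configured=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    expected=$(opsgenie_base_url "$2") || return 1
    if [ "$configured" = "$expected" ]; then
        echo "OK - baseUrl matches the $2 region"
    else
        echo "CRITICAL - baseUrl is $1, expected $expected"
        return 2
    fi
}
```

For example, assuming jq is installed, the configured value could be read with `jq -r '.baseUrl'` from config.json and passed in as `check_base_url "<configured baseUrl>" eu`.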

  • Configuration notes
    The file /home/Opsgenie/oec/Opsgenie-icinga2/Opsgenie.conf that contains Notification Commands should be placed in /neteye/shared/icinga2/conf/icinga2/conf.d/Opsgenie.conf

This will let you use the NetEye Director KickStart Wizard tool (https://YOUR-NETEYE-FQDN/neteye/director/dashboard?name=infrastructure#!/neteye/director/kickstart) and automatically import the command into External Commands.

Before you run the Wizard, assign the correct Icinga permissions to this file. I also suggest that you comment out the User object definition in the Opsgenie.conf file, as it may cause problems with the import. So comment out the following lines and create the User Opsgenie manually in NetEye.

# object User "Opsgenie" {
#     import "generic-user"
#     display_name = "Opsgenie Contact"
# }
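
The commenting-out step can also be scripted. The following is a minimal sketch, assuming the User object starts at the beginning of a line and ends at the first line that begins with a closing brace; `comment_out_opsgenie_user` is a hypothetical helper name, and the object name in the pattern must match the spelling used in your file.

```shell
#!/bin/sh
# Hypothetical helper: print the given config file to stdout with the
# 'object User "Opsgenie" { ... }' block commented out.
comment_out_opsgenie_user() {
    # The sed range starts at the object line and ends at the first
    # following line that begins with "}"; each line in it gets "# ".
    sed '/^object User "Opsgenie"/,/^}/ s/^/# /' "$1"
}
```

The function only prints the result, so the original file stays untouched until you redirect the output to a new file and review it.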

Now you can run the wizard and deploy your changes!

  • NetEye External commands
    Once you’ve checked that the two new External Commands are present in your NetEye, you should create the two related notifications. They can be imported with icingacli using the following command; the two notification templates to import are listed after it:

icingacli director notification create --key '<notification-name>' --json '<json>'

template Notification "Opsgenie-host-notification" {
    import "generic notify all host events"
    command = "Opsgenie-host-notification"
    period = "24x7"
}
template Notification "Opsgenie-service-notification" {
    import "generic notify all events service"
    command = "Opsgenie-service-notification"
    period = "24x7"
}
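
As a rough illustration, the `<json>` argument for these two templates could be generated as follows. The property names used here (object_name, object_type, imports, command, period) are my assumption about the JSON format Director expects and should be verified on your system; `notification_json` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print a JSON definition for one notification template.
# ASSUMPTION: the property names follow Director's JSON format; verify them
# against your own Director installation before importing.
notification_json() {
    # $1 = template name (also used as the command name)
    # $2 = name of the template to import
    printf '{"object_name": "%s", "object_type": "template", "imports": ["%s"], "command": "%s", "period": "24x7"}\n' \
        "$1" "$2" "$1"
}
```

Calling `notification_json "Opsgenie-host-notification" "generic notify all host events"` prints a single JSON line that can be passed as the `<json>` argument of the icingacli command above.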
  • NetEye Business Process – OpsGenie Team’s Service synchronization

To synchronize the NetEye Business Processes with the Opsgenie Team Services, we have a ready-made check script. Here is an example of my quick (but working) bash test:

#!/bin/bash
#
# NETEYEDEMO
# DENI 28.10.2020
#
#  this script is just a test .. it currently does not handle BP with spaces
# 
opsgenieAPIURL="https://api.eu.opsgenie.com/v1/services"
opsgenieapitoken="(OpsGenie-Token)" 
TeamId="(Team's id)" # it can be retrieved with API but this is faster :-) 

## GET BP
echo " "
echo " Get BP processes from NetEye"
echo " "

hosts=$(icingacli businessprocess process list )
neteyearray=()
for row in $hosts; do
    if [[ $row =~ ^\( ]];then
        continue
     fi
    echo "NetEye Business Process: " $row
    neteyearray+=("$row")
done


echo " "
echo " Get services from OpsGenie"
echo " "

## GET SERVICES FROM OPSGENIE
opsgeniehosts=$(curl -s -X GET "$opsgenieAPIURL" -H "Authorization: GenieKey $opsgenieapitoken" -H "Content-Type: application/json" | jq -r '.data[].name')
opsgeniearray=()
for opsgenierow in $opsgeniehosts; do
    echo "OpsGenie Service: " $opsgenierow
    opsgeniearray+=("$opsgenierow")
done


## GET DIFFERENCES BETWEEN TWO ARRAYS TO AVOID DUPLICATES
HostDiff=()
for i in "${neteyearray[@]}"; do
    skip=
    for j in "${opsgeniearray[@]}"; do
        [[ $i == $j ]] && { skip=1; break; }
    done
    [[ -n $skip ]] || HostDiff+=("$i")
done


# CREATE ONLY NEW SERVICE IN OPSGENIE
hostlist=""
for i in "${HostDiff[@]}"
do
           json='{"teamId": "'$TeamId'","name": "'$i'", "tags": ["neteye","business process"] }'
           curl -X POST  "$opsgenieAPIURL" -H "Authorization: GenieKey $opsgenieapitoken" -H "Content-Type: application/json" --data "$json" -v   
           hostlist+=" ( $i ) "
done

if [ -n "$hostlist" ]
then
  echo "OK - synchronization successfully executed: created or updated objects: [$hostlist] "
else
  echo "OK - synchronization successfully executed: no new hosts have been created or updated"
fi
exit 0

# you can schedule this check once a day and let NetEye handle the sync

And this is the result!
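
The test script notes that it does not handle Business Processes with spaces in their names. One way around that limitation is to compare the two lists line by line instead of word by word. The function below is only a sketch of that idea, not part of the script above, and `missing_in_opsgenie` is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: print every name from the first newline-separated
# list ($1) that is missing from the second one ($2).
# Names are read one line at a time, so embedded spaces are preserved.
missing_in_opsgenie() {
    printf '%s\n' "$1" | while IFS= read -r name; do
        # Skip empty lines
        [ -z "$name" ] && continue
        # -F: fixed string, -x: match the whole line, -q: no output
        if ! printf '%s\n' "$2" | grep -Fxq -- "$name"; then
            printf '%s\n' "$name"
        fi
    done
}
```

Given the NetEye list "Web Shop", "ERP", "Mail" and the Opsgenie list "ERP", "Mail", the function prints only "Web Shop", which is the one service that still has to be created.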

  • Running oec
    As described in the online guide, please check that the oec service is working correctly on your NetEye:
    • systemctl status oec (if stopped, start it!)
    • journalctl -u oec -f
    • tail -f /home/Opsgenie/oec/output/output.txt
      (during ack/un-ack operations from Opsgenie you should see some debug information, since the log level is set to DEBUG by default in Opsgenie.conf.)
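
If you would like NetEye itself to watch the oec service, the manual check can be turned into a small plugin-style test. This is a sketch under the assumption that the state comes from `systemctl is-active oec`; `oec_state_to_check` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: map the state printed by "systemctl is-active"
# to a monitoring-plugin style message and exit code
# (0 = OK, 2 = CRITICAL).
oec_state_to_check() {
    if [ "$1" = "active" ]; then
        echo "OK - oec service is active"
        return 0
    fi
    # Any other state (inactive, failed, or empty) is reported as critical
    echo "CRITICAL - oec service is ${1:-unknown}"
    return 2
}
```

On the NetEye host it could be called as `oec_state_to_check "$(systemctl is-active oec)"`.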

Nicola Degara

Technical Consulting and Delivery Team Manager at Würth Phoenix
My name is Nicola Degara and I work as Consulting and Delivery Team Leader. With a technical developer background, I have been in the IT field since the nineties. After more than six years’ experience as director of an international software development company in Shanghai, I returned to Europe and embraced the dynamic NetEye experience and philosophy, in combination with international Atlassian product projects. My strong conviction in Open Source continues to influence my daily choices and future vision.
