Implementing an effective alert management system is about 90% planning and 10% execution. The keys are to properly identify and classify events that generate alerts, prioritize the alerts so they are received by the right people at the right time, and ensure that the individuals configured to receive alerts are skilled and equipped to directly respond to the event that generated the alert.

Prioritizing Alerts

The first task in implementing alert management is classifying the events that generate alerts. One possible classification methodology is a simple high-medium-low priority system that would generate Critical, Warning, and Informational alerts.

  • A high priority or “Critical” alert would include any loss of service, or any event that may imminently lead to a loss of service.
  • A medium priority or “Warning” alert would include events of significance that do not directly result in a loss of service.
  • A low priority or “Informational” alert would include events that do not result in a loss of service and whose response is not time-critical.
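The three-level scheme above can be sketched in a few lines of Python. This is only an illustration: the Event fields (service_lost, loss_imminent, significant) are hypothetical flags, and a real monitoring system would derive them from its own event data.

```python
from dataclasses import dataclass
from enum import Enum


class Priority(Enum):
    CRITICAL = "Critical"            # high priority
    WARNING = "Warning"              # medium priority
    INFORMATIONAL = "Informational"  # low priority


@dataclass
class Event:
    # All fields are hypothetical; they stand in for whatever your
    # monitoring system reports about an event.
    description: str
    service_lost: bool = False    # a service is already unavailable
    loss_imminent: bool = False   # may imminently lead to a loss of service
    significant: bool = False     # significant, but no loss of service


def classify(event: Event) -> Priority:
    """Map an event to an alert priority using the high-medium-low scheme."""
    if event.service_lost or event.loss_imminent:
        return Priority.CRITICAL
    if event.significant:
        return Priority.WARNING
    return Priority.INFORMATIONAL
```

Checking the loss-of-service conditions first means an event that is both significant and service-affecting is always treated as Critical.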

The purpose of an alert is to generate some action in response to an event. Events that do not require a response should never generate an alert. In addition, the response may need to be completed in a specified time period, or before another event occurs.

Every event that generates an alert should have a defined response, and if that response is time-critical, its time constraints should also be specified.
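One way to make this rule concrete is a lookup table that pairs each alerting event with its response and, where relevant, a time limit. The sketch below is a minimal example; the event names, actions, and time limits are invented for illustration and are not taken from any particular product.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional


@dataclass(frozen=True)
class ResponsePlan:
    action: str                                   # what the recipient must do
    complete_within: Optional[timedelta] = None   # None means not time-critical


# Hypothetical event types and responses, for illustration only.
RESPONSE_PLANS = {
    "database_down": ResponsePlan("Fail over to the standby database",
                                  timedelta(minutes=15)),
    "disk_80_percent": ResponsePlan("Archive or purge old log files",
                                    timedelta(hours=24)),
    "license_expiring": ResponsePlan("Renew the license"),
}


def plan_for(event_type: str) -> Optional[ResponsePlan]:
    """Return the defined response for an event type.

    None means no response is defined, so the event should not
    generate an alert at all.
    """
    return RESPONSE_PLANS.get(event_type)
```

With this layout, an event type that is missing from the table simply never alerts, which keeps "no response, no alert" enforced in one place.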

If you think of alerts as described above, then matching events to alerts will generally be straightforward.

Assigning Alerts

The next task in alert management is defining the criteria for assigning alerts. The criteria may vary based on the alert type, the event source, or perhaps simply the day/time when the alert was generated. The key here is that a person who receives an alert must be capable of and empowered to respond to the alert in the manner required.

Generally speaking, an alert should be sent to only one person. Sending alerts to multiple people may unnecessarily complicate the response to the event. The most difficult part of this task is managing the scheduling of alert recipients.
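As a rough sketch of schedule-based assignment, the example below picks exactly one recipient from the event source and the day and time of the alert. The names, sources, working hours, and the fallback recipient are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, time


@dataclass(frozen=True)
class Shift:
    recipient: str        # exactly one person per shift
    source: str           # event source this person is able to respond to
    weekdays: frozenset   # 0 = Monday ... 6 = Sunday
    start: time           # shift start (inclusive)
    end: time             # shift end (exclusive)


# Hypothetical schedule, for illustration only.
SCHEDULE = [
    Shift("alice", "database", frozenset(range(0, 5)), time(8, 0), time(18, 0)),
    Shift("bob", "network", frozenset(range(0, 5)), time(8, 0), time(18, 0)),
]
FALLBACK_RECIPIENT = "on_call_engineer"  # nights, weekends, unlisted sources


def assign(source: str, when: datetime) -> str:
    """Return the single recipient for an alert from `source` raised at `when`."""
    for shift in SCHEDULE:
        if (shift.source == source
                and when.weekday() in shift.weekdays
                and shift.start <= when.time() < shift.end):
            return shift.recipient
    return FALLBACK_RECIPIENT
```

Because the function returns on the first matching shift, each alert goes to one person even if the schedule accidentally contains overlapping entries.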

Assigning alerts should also include two required administrative actions on the part of the assigned person:

  • Acknowledging receipt of the alert within a specified time period.
  • Confirming completion of the required event response within a specified time period.
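These two actions can be tracked with a small record per assigned alert. The following is a sketch under stated assumptions: the class and method names are hypothetical, and both time limits are measured from the moment the alert was sent, which is one possible convention rather than a requirement.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class AssignedAlert:
    recipient: str
    sent_at: datetime
    ack_within: timedelta        # time allowed to acknowledge receipt
    complete_within: timedelta   # time allowed to confirm the response
    acknowledged_at: Optional[datetime] = None
    completed_at: Optional[datetime] = None

    def acknowledge(self, now: datetime) -> None:
        """Record the first acknowledgment; later calls are ignored."""
        if self.acknowledged_at is None:
            self.acknowledged_at = now

    def confirm_completion(self, now: datetime) -> None:
        """Record the first completion confirmation; later calls are ignored."""
        if self.completed_at is None:
            self.completed_at = now

    def ack_overdue(self, now: datetime) -> bool:
        """True if the alert is still unacknowledged after its time limit."""
        return (self.acknowledged_at is None
                and now > self.sent_at + self.ack_within)

    def completion_overdue(self, now: datetime) -> bool:
        """True if the response is still unconfirmed after its time limit."""
        return (self.completed_at is None
                and now > self.sent_at + self.complete_within)
```

Passing the current time in as an argument, instead of calling datetime.now() inside the methods, makes the deadlines easy to test.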

Escalating Alerts

Alert escalation is the process of involving additional people in an event response when:

  • Receipt of the alert has not been acknowledged within a specified time period.
  • Confirmation of completion of the required event response has not occurred within a specified time period.

Alert escalations should be implemented as an automated response by your alert management system. Every alert should have a pre-defined escalation response.
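A pre-defined escalation response can be as simple as an ordered list of contacts plus a rule for when to move to the next one. The sketch below assumes a hypothetical three-step chain and takes the two overdue conditions as plain booleans, so it does not depend on how your system tracks deadlines.

```python
from typing import Optional, Tuple

# Hypothetical escalation chain, for illustration only.
ESCALATION_CHAIN = ["primary_on_call", "team_lead", "operations_manager"]


def escalate(current: str, ack_overdue: bool,
             completion_overdue: bool) -> Optional[Tuple[str, str]]:
    """Decide whether to involve the next person in the chain.

    Returns (contact_to_notify, reason), or None when neither
    escalation condition is met.
    """
    if ack_overdue:
        reason = "alert not acknowledged in time"
    elif completion_overdue:
        reason = "response not confirmed in time"
    else:
        return None

    index = ESCALATION_CHAIN.index(current)
    # At the top of the chain there is no one further to involve,
    # so the alert stays with the final contact.
    next_index = min(index + 1, len(ESCALATION_CHAIN) - 1)
    return ESCALATION_CHAIN[next_index], reason
```

For example, escalate("primary_on_call", True, False) returns ("team_lead", "alert not acknowledged in time"). An automated system would call a function like this on a timer for every open alert.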