Incident Escalation

From KynetxDocs

Incident Severity Classification

  • Severity 1 (Sev1) - Complete loss of functionality of the Kynetx Network Service. When a Sev1 incident is declared, the on-call IT Operations team will follow the incident escalation ladder below. A Command and Control (C&C) conference bridge will be established to coordinate the triage efforts. The Mean Time To Repair (MTTR) goal for a Sev1 incident is < 2 hours.
  • Severity 2 (Sev2) - Severely degraded performance or functionality within the Kynetx Network Services. A Sev2 incident will be declared in cooperation with the client. The Kynetx IT Operations team will work to resolve all Sev2 incidents within an MTTR goal of < 8 hours. Escalating a Sev2 to a Sev1 requires the approval of the Kynetx VP, Ops and Engineering; the CTO; or the CEO.
  • Severity 3 (Sev3) - Mildly degraded performance or functionality within the Kynetx Network Services. A Sev3 incident will be worked as time allows, but will be closed within an MTTR goal of < 48 hours from the time of declaration.
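The MTTR goals above can be expressed as a small lookup, which may be useful when checking whether an incident was resolved within its goal. This is only an illustrative sketch: the names `MTTR_GOALS`, `resolution_deadline`, and `is_within_goal` are hypothetical and not part of any Kynetx tooling, and the sketch assumes the clock starts at the time of declaration for every severity.

```python
from datetime import datetime, timedelta

# MTTR goals taken from the severity definitions above.
# Assumption: each goal is measured from the time the incident is declared.
MTTR_GOALS = {
    "Sev1": timedelta(hours=2),
    "Sev2": timedelta(hours=8),
    "Sev3": timedelta(hours=48),
}


def resolution_deadline(severity, declared_at):
    """Return the time by which the incident should be resolved."""
    return declared_at + MTTR_GOALS[severity]


def is_within_goal(severity, declared_at, resolved_at):
    """Return True if the repair time is strictly under the MTTR goal."""
    return (resolved_at - declared_at) < MTTR_GOALS[severity]


if __name__ == "__main__":
    declared = datetime(2024, 1, 1, 9, 0)
    print(resolution_deadline("Sev2", declared))
    print(is_within_goal("Sev1", declared, declared + timedelta(minutes=90)))
```

The comparison is strict because the goals are written as "< 2 hours", "< 8 hours", and "< 48 hours".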

Incident Escalation Ladder

All incident reporting should follow this escalation ladder:

Role | Timeline | Notes
Account Representative | Immediate | Ideally, customers contact their account representative to report problems. If the customer contacts someone further up, or outside the ladder, that person should notify everyone below them in the ladder of the incident.
IT Operations engineer | Immediate | The operations engineer for the affected service.
VP, Ops and Engineering | After 30 minutes | Any incident that has not been cleared should be reported by phone or SMS.
Chief Technology Officer | After 90 minutes | Any incident that has not been cleared should be reported by phone or SMS.
Chief Executive Officer | After 120 minutes | Any incident that has not been cleared should be reported by phone or SMS.
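As a rough illustration of how the ladder's timing works, the sketch below lists which roles should have been notified once an incident has been open for a given number of minutes. The names `ESCALATION_LADDER` and `roles_to_notify` are hypothetical, and the sketch assumes that "After N minutes" means the role is notified once the incident has been open for at least N minutes.

```python
# Escalation ladder from the table above: (role, minutes after which to notify).
# Assumption: "Immediate" is modeled as 0 minutes.
ESCALATION_LADDER = [
    ("Account Representative", 0),
    ("IT Operations engineer", 0),
    ("VP, Ops and Engineering", 30),
    ("Chief Technology Officer", 90),
    ("Chief Executive Officer", 120),
]


def roles_to_notify(minutes_open):
    """Return the roles that should have been notified for an uncleared incident."""
    return [role for role, after in ESCALATION_LADDER if minutes_open >= after]


if __name__ == "__main__":
    for minutes in (0, 45, 120):
        print(minutes, roles_to_notify(minutes))
```

For example, an incident still open after 45 minutes would involve the account representative, the operations engineer, and the VP, Ops and Engineering, but not yet the CTO or CEO.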

The account representative is responsible for keeping customers apprised of incident status and should follow up with customers after the incident to gather feedback.

The operations engineer is responsible for filing the after-incident report and coordinating after-incident follow-up inside the company.

All customers will be given specific contact information for their account representative, an alternate, and the CTO and CEO.
