Ops: Incidents

Public

Incidents are time-boxed, blameless, and action-oriented. Prioritize user impact and clear communication.

Severity Levels

  • SEV1: critical outage or security event; executive comms required.
  • SEV2: major degradation; workarounds available.
  • SEV3: localized or minor impact.
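For tooling that needs to tag incidents, the levels above could be encoded roughly as follows. This is only a sketch: the Impact fields and the function names classify and requires_executive_comms are hypothetical, and real triage will usually need human judgment rather than three booleans.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    SEV1 = 1  # critical outage or security event
    SEV2 = 2  # major degradation; workarounds available
    SEV3 = 3  # localized or minor impact


@dataclass
class Impact:
    # Hypothetical triage inputs; these field names are assumptions,
    # not part of the policy text.
    critical_outage: bool = False
    security_event: bool = False
    major_degradation: bool = False


def classify(impact: Impact) -> Severity:
    """Map observed impact to a severity level, most severe first."""
    if impact.critical_outage or impact.security_event:
        return Severity.SEV1
    if impact.major_degradation:
        return Severity.SEV2
    return Severity.SEV3


def requires_executive_comms(severity: Severity) -> bool:
    """The policy names executive comms as a requirement for SEV1."""
    return severity is Severity.SEV1
```

Checking the most severe condition first means an incident that is both a security event and a degradation is tagged SEV1.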

Communication

  1. Start an incident channel; assign an incident commander.
  2. Publish user-facing updates at a fixed cadence.
  3. Document decisions, mitigations, and timelines.
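A fixed update cadence (step 2) can be made concrete by computing when each user-facing update is due. The sketch below assumes a hypothetical helper named update_times; the interval is a parameter because this page does not prescribe one, and the 30 minutes in the usage note is purely illustrative.

```python
from datetime import datetime, timedelta


def update_times(start: datetime, interval: timedelta, count: int) -> list[datetime]:
    """Return the due times of the next `count` updates after `start`.

    `interval` is chosen by the incident commander; no value is implied
    by the policy itself.
    """
    return [start + interval * i for i in range(1, count + 1)]
```

For example, update_times(started_at, timedelta(minutes=30), 3) yields three due times spaced 30 minutes apart, beginning 30 minutes after started_at.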

Timelines

Mitigate first, then investigate. Within 48 hours, write a brief post-incident review that lists concrete follow-ups, each with a named owner.
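The 48-hour review window can be tracked with simple date arithmetic. This sketch assumes the clock starts when the incident is mitigated, which this page does not state explicitly; review_due and is_overdue are hypothetical helper names.

```python
from datetime import datetime, timedelta

# From the policy: the post-incident review is due within 48 hours.
REVIEW_WINDOW = timedelta(hours=48)


def review_due(mitigated_at: datetime) -> datetime:
    """Deadline for the review, assuming the window starts at mitigation."""
    return mitigated_at + REVIEW_WINDOW


def is_overdue(mitigated_at: datetime, now: datetime) -> bool:
    """True once `now` is past the review deadline."""
    return now > review_due(mitigated_at)
```

Use timezone-aware datetimes for both arguments so the comparison stays valid across regions.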

Reminder

If safety is impacted (e.g., data exposure risk), escalate to Security immediately and follow breach procedures.
