Incident Report: [INC-XXX] [Short Title]
- Date: YYYY-MM-DD
- Severity: P1 Critical | P2 High | P3 Medium | P4 Low
- Status: Investigating | Mitigating | Resolved | Post-Mortem Complete
- Owner: [Name/Agent]
- Duration: [Start time] to [End time] ([X hours])
1. Summary
[1-2 sentence description of what happened and the impact]
2. Timeline
| Time | Event |
|---|---|
| HH:MM | Incident detected — [how] |
| HH:MM | Investigation started |
| HH:MM | Root cause identified |
| HH:MM | Fix deployed |
| HH:MM | Incident resolved |
| HH:MM | Monitoring confirmed stable |
3. Impact
- Users affected: [Number/percentage]
- Services affected: [List]
- Data loss: [Yes/No — details]
- Duration: [X hours/minutes]
- Financial impact: [If applicable]
4. Root Cause
[What actually caused the incident. Be specific: not "human error" but "the configuration file had an incorrect database connection string because of a merge conflict in PR #123"]
5. Resolution
[What was done to fix the issue]
- [Step 1]
- [Step 2]
- [Step 3]
6. What Went Well
- [Thing that worked during incident response]
- [Thing that helped reduce impact]
7. What Went Wrong
- [Thing that contributed to the incident]
- [Thing that slowed resolution]
8. Action Items
| # | Action | Owner | Due Date | Status |
|---|---|---|---|---|
| 1 | [Preventive action] | | | ☐ |
| 2 | [Process improvement] | | | ☐ |
| 3 | [Monitoring improvement] | | | ☐ |
9. Lessons Learned
[Key takeaways that should inform future work]
10. Approvals
| Role | Name | Date | Reviewed |
|---|---|---|---|
| Tech Lead | | | ☐ |
| John | | | ☐ |