    Incident Postmortem Generator

    by Kaymue

    Turn a 3am outage into a postmortem in 10 minutes. Slack/PagerDuty ingest, 5-Whys, blameless framing, action items. SEV1/2/3.

    Updated Jun 2026
    0 installs

    Free

    Included in download

    • Downloadable skill package
    • 1 permission declared
    • Instant install

    About This Skill

# Incident Postmortem Generator

You just survived a 4-hour outage. Everyone's exhausted. The PM wants a writeup by EOD. The CEO wants root cause. Legal wants a timeline. This skill takes your chaos and produces a blameless postmortem in 10 minutes.

## What it does

Ingests raw incident data and produces a complete postmortem:

- **Timeline reconstruction**: merges PagerDuty alerts, Slack messages, deploy events, status updates
- **Root cause analysis**: 5-Whys + fault tree + contributing factors
- **Impact quantification**: users affected, $ lost, SLA breach
- **Action items**: owner-assigned, prioritized, with due dates
- **Blameless framing**: flags blame-y language and rewrites it, keeping the focus on systems, not individuals
- **SEV1/2/3 templates**: different depth for different severity

## When to use it

- A SEV1 just happened and you need a postmortem by EOD
- Your on-call team is burned out and skipping writeups
- Incident reviews drag on for 2 hours because the timeline is wrong
- Action items from past postmortems never get done (no owner)
- You're scaling your incident response process
- An auditor is asking "show me your postmortem process"

## Why it's better than ad-hoc prompting

Most "write a postmortem" prompts give generic templates. This skill is different:

- **Ingests your data**: Slack export, PagerDuty timeline, deploy log
- **Auto-reconstructs**: doesn't make you re-type the timeline
- **Blameless-aware**: actively flags and rewrites blame language
- **Action items with owners**: not "we should..." but "@alice owns X by date Y"
- **Severity-aware**: SEV1 has 12 sections, SEV3 has 4

## Architecture

```
┌─────────────────────────────────────────────────────────┐
│  Agent (Claude/Cursor)                                  │
│  - Points at Slack export / PagerDuty log / timeline    │
│  - Calls generator                                      │
│  - Reviews for blame, action items                      │
└───────────────┬─────────────────────────────────────────┘
                │
                ▼
┌─────────────────────────────────────────────────────────┐
│  skills/incident-postmortem-generator/                  │
│    scripts/                                             │
│    ├── ingest_slack.py       # Parse Slack export       │
│    ├── ingest_pagerduty.py   # Parse PD timeline        │
│    ├── ingest_deploys.py     # Parse deploy log         │
│    ├── build_timeline.py     # Merge all sources        │
│    ├── generate.py           # Render postmortem MD     │
│    ├── check_blameless.py    # Flag blame language      │
│    └── action_tracker.py     # Export to Linear/Jira    │
│    references/                                          │
│    ├── blameless-guide.md    # How to write blameless   │
│    ├── 5-whys.md             # RCA technique            │
│    ├── fault-tree.md         # RCA technique            │
│    ├── sev-templates.md      # SEV1/2/3 templates       │
│    └── action-items-guide.md                            │
│    templates/                                           │
│    ├── sev1.md.tmpl                                     │
│    ├── sev2.md.tmpl                                     │
│    └── sev3.md.tmpl                                     │
└─────────────────────────────────────────────────────────┘
```

## Quick start

```bash
# 1. Install
pip install python-dateutil

# 2. Ingest from various sources
python scripts/ingest_slack.py --export slack-export-2026-06-20.json
python scripts/ingest_pagerduty.py --incident PD-12345
python scripts/ingest_deploys.py --log deploys.log

# 3. Merge into one timeline
python scripts/build_timeline.py --out timeline.json

# 4. Generate postmortem
python scripts/generate.py --timeline timeline.json --sev 1 --out postmortem-2026-06-20.md

# 5. Check for blame language
python scripts/check_blameless.py postmortem-2026-06-20.md

# 6. Export action items to Jira
python scripts/action_tracker.py postmortem-2026-06-20.md --tracker jira
```

## Sample output (excerpt)

```markdown
# Postmortem: API Gateway Outage, 2026-06-20

**Severity**: SEV1
**Duration**: 4h 12m (14:32 UTC → 18:44 UTC)
**Status**: Resolved
**Customer impact**: ~23,000 users, 1.2M failed requests, est. $14k revenue loss
**SLA**: 99.9% breached by 0.04% for the month

## Timeline (UTC)

| Time | Event | Source |
|------|-------|--------|
| 14:32 | @alice on-call paged: "5xx rate 8%" | PagerDuty |
| 14:35 | @alice acks, opens #inc-2026-06-20 | Slack |
| 14:41 | Identified: API gateway OOMKilled | Slack |
| 14:48 | @bob: "Scaled gateway from 4 to 8 pods" | Slack |
| 15:10 | 5xx down to 0.3%, still elevated | Slack |
| 15:30 | Root cause: memory leak in new auth middleware (deployed 14:28) | Slack |
| 15:45 | @bob: "Rolled back deploy auth-middleware v2.3.1 → v2.3.0" | Deploy log |
| 16:00 | 5xx back to 0.05% (normal) | Slack |
| 16:30 | @alice: "Monitoring for 2h before closing" | Slack |
| 18:44 | Incident closed | PagerDuty |

## Root cause

A memory leak was introduced in `auth-middleware` v2.3.1 (deployed 14:28 UTC). The leak caused the API gateway pods to be OOMKilled within 4 minutes of receiving traffic, cycling faster than the load balancer could route around them.

### 5 Whys

1. **Why did the API fail?** Pods were OOMKilled
2. **Why were pods OOMKilled?** Memory leak in auth-middleware
3. **Why was there a memory leak?** New code didn't release session cache
4. **Why didn't we catch it?** Load test didn't include 24h soak test
5. **Why didn't we have a 24h soak test?** Our CI pipeline only does 5min load tests

### Contributing factors

- No memory limits set on the new auth-middleware container
- Canary deploy was 5%, too small to detect a slow leak
- No automated rollback on memory growth rate
- The deploy went out 4 minutes before the first page, leaving no time to detect the problem before paging

## Impact

- **Users affected**: ~23,000 (12% of daily active)
- **Failed requests**: 1.2M
- **Revenue lost**: ~$14,000 (subscription churn + transaction failures)
- **SLA breach**: 99.9% monthly → 99.86% (over 24h, partial month)

## Action items

| # | Action | Owner | Priority | Due | Status |
|---|--------|-------|----------|-----|--------|
| 1 | Add memory limit to all middleware containers | @bob | P0 | 2026-06-25 | Open |
| 2 | Add 24h soak test to CI pipeline | @carol | P0 | 2026-07-05 | Open |
| 3 | Increase canary deploy to 20% | @bob | P1 | 2026-07-10 | Open |
| 4 | Auto-rollback on memory growth > 20%/h | @dave | P1 | 2026-07-15 | Open |
| 5 | Add memory-leak detector to pre-commit hooks | @alice | P2 | 2026-07-30 | Open |
| 6 | Document postmortem process in runbook | @alice | P2 | 2026-08-01 | Open |

## What went well

- Fast ack time (3 min from page to incident channel)
- Quick identification of root cause
- Clean rollback, no data loss

## What went poorly

- Memory limit missing on new container
- Canary too small to detect slow leak
- No automated rollback on memory anomalies
```

## Pricing

Single-purchase, lifetime access. $8.50.

Includes:

- 7 Python scripts (3 ingest + 1 merge + 1 generate + 1 check + 1 tracker)
- 5 reference docs (blameless, 5-whys, fault tree, SEV templates, actions)
- 3 SEV templates (SEV1/2/3)
- Slack/PagerDuty/Jira integration code
- Future updates for the same major version

## Example usage

> "We had a 4-hour SEV1 yesterday. Here's the Slack export and PagerDuty incident ID. Generate the postmortem."

The skill will:

1. Parse Slack and PagerDuty data
2. Reconstruct timeline
3. Run 5-Whys + fault tree
4. Generate postmortem with action items
5. Flag any blame language
6. Export action items to your tracker

## Compatibility

Works with any agent that supports the SKILL.md standard and can execute Python: Claude Code, OpenClaw, Codex CLI, Cursor, Gemini CLI, Cline, Windsurf, Aider.

The Slack export is free; PagerDuty and Jira/Linear each require an API token. Tested on Linux, macOS, Windows.

## Tags

sre, incident-response, postmortem, devops, on-call, monitoring, reliability
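
The timeline reconstruction step described above (merging PagerDuty, Slack, and deploy events into one ordered table) could look roughly like the sketch below. It is an illustration only, not the code shipped in `build_timeline.py`: the `Event` class, `merge_timeline`, and `to_markdown` are hypothetical names, and the sketch assumes every source already yields timezone-aware timestamps.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Event:
    """One timeline entry. Field names are illustrative, not the skill's schema."""
    at: datetime   # must be timezone-aware; naive values would be read as local time
    source: str
    text: str


def merge_timeline(*sources: list[Event]) -> list[Event]:
    """Combine events from every source, normalise to UTC, sort chronologically."""
    merged = [
        Event(e.at.astimezone(timezone.utc), e.source, e.text)
        for events in sources
        for e in events
    ]
    # sorted() is stable, so events sharing a timestamp keep their input order.
    return sorted(merged, key=lambda e: e.at)


def to_markdown(events: list[Event]) -> str:
    """Render merged events as a Time / Event / Source table."""
    rows = ["| Time | Event | Source |", "|------|-------|--------|"]
    rows += [f"| {e.at:%H:%M} | {e.text} | {e.source} |" for e in events]
    return "\n".join(rows)


def utc(hour: int, minute: int) -> datetime:
    return datetime(2026, 6, 20, hour, minute, tzinfo=timezone.utc)


# Sample data borrowed from the excerpt above; the UTC+2 deploy log is an assumption
# added to show why normalising before sorting matters.
cest = timezone(timedelta(hours=2))
pagerduty = [
    Event(utc(14, 32), "PagerDuty", "5xx rate 8%"),
    Event(utc(18, 44), "PagerDuty", "Incident closed"),
]
slack = [Event(utc(14, 41), "Slack", "Identified: API gateway OOMKilled")]
deploys = [
    Event(datetime(2026, 6, 20, 17, 45, tzinfo=cest), "Deploy log",
          "Rolled back auth-middleware v2.3.1 → v2.3.0"),
]

timeline = merge_timeline(pagerduty, slack, deploys)
print(to_markdown(timeline))
```

Normalising to UTC before sorting is the important design choice here: a deploy log written in local time would otherwise land in the wrong place relative to the pager and chat events.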

    Use Cases

    • Turn a chaotic 3am outage into a structured postmortem in 10 minutes. Ingests timeline + logs + chat, produces a blameless postmortem with root cause analysis, contributing factors, and tracked action items. Templates for SEV1/2/3.
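
To give a feel for what flagging blame language might involve, here is a minimal sketch based on a small phrase list. It is not the logic of `check_blameless.py`; the `BLAME_PATTERNS` list, the suggestions, and the `find_blame` function are all assumptions made for illustration, and a phrase list like this will produce false positives that a human reviewer still has to judge.

```python
import re

# Hypothetical phrase list for illustration only; the skill's real rules may differ.
BLAME_PATTERNS = [
    (re.compile(r"\b(?:failed|forgot|neglected) to\b", re.IGNORECASE),
     "describe the missing safeguard instead of the omission"),
    (re.compile(r"\b(?:careless(?:ly)?|negligen(?:t|ce)|human error)\b", re.IGNORECASE),
     "name the system condition that made the mistake possible"),
    (re.compile(r"\bshould have (?:known|caught|noticed|checked)\b", re.IGNORECASE),
     "state which signal was unavailable at the time"),
]


def find_blame(text: str) -> list[tuple[int, str, str]]:
    """Return (line_number, matched_phrase, suggestion) for each flagged phrase."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for pattern, suggestion in BLAME_PATTERNS:
            for match in pattern.finditer(line):
                findings.append((lineno, match.group(0), suggestion))
    return findings


draft = (
    "@bob forgot to set a memory limit.\n"
    "The container had no memory limit configured.\n"
)
findings = find_blame(draft)
for lineno, phrase, suggestion in findings:
    print(f"line {lineno}: '{phrase}' -> {suggestion}")
```

In this sketch the first sentence is flagged because it centers a person's omission, while the second passes because it describes the state of the system.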

    Security Scanned

    Passed automated security review

    Permissions

    Terminal / Shell

    Works with any agent that supports the universal SKILL.md standard
