    Codebase Archaeology

    by Kaymue

    Reverse-engineer unfamiliar code. Dependency map, dead code, risk hotspots, onboarding guide. Survive inheriting 200k lines.

    Updated Jun 2026
    0 installs

    Free

    Included in download

    • Downloadable skill package
    • 1 permission declared
    • Instant install

    About This Skill

# Codebase Archaeology

You just inherited 200,000 lines of code. The author left. There are no docs. The CEO wants a new feature by Friday. This skill turns "where do I even start" into a structured 2-week onboarding plan.

## What it does

A systematic reverse-engineering workflow for any codebase:

- **Dependency map** — module graph, circular deps, fan-in / fan-out metrics
- **Dead code report** — unused exports, unreachable functions, orphaned files
- **Risk hotspots** — files that change often + are complex + lack tests
- **Conventions detector** — what naming/structure does this codebase actually use
- **Hidden entry points** — scripts, cron jobs, CLI tools, undocumented APIs
- **Onboarding guide generator** — README, ARCHITECTURE.md, CONCEPTS.md
- **"Where do I change X"** — for a feature request, identifies all touchpoints

## When to use it

- You just joined a team and need to ramp up fast
- You inherited a legacy codebase with no docs
- You need to estimate the cost of a refactor
- You want to find dead code to delete (or test gaps to fill)
- You need to onboard a new hire
- You're auditing a codebase before acquisition

## Why it's better than ad-hoc prompting

Most "explain this codebase" prompts produce surface-level summaries. This skill is different:

- **Quantitative** — every module gets a score (complexity, coupling, churn)
- **Actionable** — outputs a prioritized 2-week plan, not just docs
- **Visual** — generates interactive dependency graphs (Mermaid)
- **Comprehensive** — covers 12 dimensions, not just "what does it do"
- **Cumulative** — second run shows what's changed since first

## Architecture

```
┌─────────────────────────────────────────────────────────┐
│  Agent (Claude/Cursor)                                  │
│  - Points at a codebase                                 │
│  - Runs archaeology scripts                             │
│  - Synthesizes findings + onboarding plan               │
└───────────────┬─────────────────────────────────────────┘
                │
                ▼
┌─────────────────────────────────────────────────────────┐
│  skills/codebase-archaeology/                           │
│    scripts/                                             │
│    ├── dependency_map.py   # Import graph + cycles      │
│    ├── dead_code.py        # Unused exports, funcs      │
│    ├── hotspots.py         # Churn × complexity         │
│    ├── conventions.py      # Style + pattern detection  │
│    ├── entry_points.py     # Scripts, cron, CLI         │
│    ├── onboarding_gen.py   # README, ARCHITECTURE       │
│    └── feature_locator.py  # "Where do I add X?"        │
│    references/                                          │
│    ├── onboarding-plan.md                               │
│    ├── hotspot-playbook.md                              │
│    └── dead-code-policy.md                              │
│    templates/                                           │
│    ├── ARCHITECTURE.md.tmpl                             │
│    └── CONCEPTS.md.tmpl                                 │
└─────────────────────────────────────────────────────────┘
```

## Quick start

```bash
# 1. Install
pip install networkx radon lizard pydeps mccabe

# 2. Generate dependency map
python scripts/dependency_map.py ./src --format mermaid > docs/architecture.md

# 3. Find dead code (Python)
python scripts/dead_code.py ./src --language python

# 4. Risk hotspots
python scripts/hotspots.py ./src --since "1 year ago"

# 5. Detect conventions
python scripts/conventions.py ./src

# 6. Find entry points
python scripts/entry_points.py .

# 7. Generate onboarding guide
python scripts/onboarding_gen.py ./src --output docs/

# 8. "Where do I add a new feature?"
python scripts/feature_locator.py ./src "user authentication"
```

## Sample onboarding output (excerpt)

```
# Codebase Onboarding Plan — 2 weeks

## Day 1-2: Reconnaissance
- [ ] Read README.md (auto-generated)
- [ ] Review ARCHITECTURE.md (auto-generated) — focus on:
  - Module structure (3 layers: api → service → data)
  - 3 main domains: users, billing, reports
- [ ] Skim 5 most-imported files (top of dependency map)
- [ ] Run the test suite once to know the baseline

## Day 3-4: Hotspot familiarization
- [ ] Open top 5 hotspot files (most changed + most complex)
- [ ] Read their tests — they encode the team's expectations
- [ ] Note the 3 "load-bearing" modules (high fan-in, low churn)

## Day 5-7: Make your first change (in test branch)
- [ ] Add a feature in the simplest module
- [ ] Run lints, tests, type checks
- [ ] Open a PR — observe review feedback patterns
- [ ] Update ARCHITECTURE.md with what you learned

## Day 8-10: Tackle a small bug
- [ ] Pick a low-priority issue
- [ ] Use feature_locator.py to find touchpoints
- [ ] Make the fix, add a regression test
- [ ] Note any "weird" code that needs explaining

## Day 11-14: Write your "I just joined" doc
- [ ] 3 things that surprised you
- [ ] 3 things that are broken-but-intentional
- [ ] 3 things you'd refactor given time
- [ ] Add to CONCEPTS.md (auto-updated each run)
```

## The 12 dimensions analyzed

1. **Module structure** — top-level layout, layer count, domain boundaries
2. **Dependency graph** — module imports, cycles, fan-in/fan-out
3. **Dead code** — unused exports, unreachable functions, orphan files
4. **Risk hotspots** — files with high churn AND high complexity
5. **Test coverage** — line + branch, gap analysis
6. **Style conventions** — naming, formatting, file structure
7. **Error handling** — exception patterns, error codes, retry logic
8. **Concurrency model** — threads, async, locks, actors
9. **External integrations** — APIs, DBs, queues, third-party libs
10. **Configuration** — env vars, config files, secrets
11. **Entry points** — main(), CLIs, cron, message handlers, webhooks
12. **Documentation gaps** — public functions without doc comments

## Pricing

Single-purchase, lifetime access. $9.00.

Includes:

- 7 Python archaeology scripts
- 3 reference docs (onboarding plan, hotspot playbook, dead-code policy)
- 2 templates (ARCHITECTURE.md, CONCEPTS.md)
- Sample analysis of a real open-source project
- Future updates for the same major version

## Example usage

> "I'm joining a team next week. They have a 200k line Python/TypeScript monorepo with no docs. Give me a 2-week onboarding plan."

The skill will:

1. Run all 12 dimensions
2. Generate ARCHITECTURE.md from real data (not vibes)
3. Identify the 5 "must-understand" files
4. Output a day-by-day plan
5. Save findings to `docs/` for future team members

## Compatibility

Works with any agent that supports the SKILL.md standard and can execute Python: Claude Code, OpenClaw, Codex CLI, Cursor, Gemini CLI, Cline, Windsurf, Aider.

Supports Python, TypeScript, Go (full); Rust, Java (partial). Requires Git for hotspot/churn analysis. Tested on Linux, macOS, Windows.

## Tags

code-analysis, refactoring, documentation, onboarding, technical-debt, legacy, code-quality
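
## Illustrative sketch: churn × complexity

The listing describes risk hotspots as files with high churn and high complexity, but does not show how `hotspots.py` computes its score. The following is a minimal sketch of the general idea, not the skill's actual implementation. It assumes churn is counted from the output of `git log --since="1 year ago" --name-only --pretty=format:` and uses a crude branch count from Python's `ast` module as a stand-in for a real complexity metric. All function names, file paths, and sample data here are hypothetical.

```python
import ast
from collections import Counter

# Node types counted as decision points. This is an assumed, rough proxy
# for cyclomatic complexity, not the metric the skill itself uses.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.Try, ast.With,
                ast.BoolOp, ast.ExceptHandler, ast.IfExp)


def churn_from_git_log(log_text: str) -> Counter:
    """Count how often each path appears in the text produced by
    `git log --name-only --pretty=format:` (one path per non-blank line)."""
    return Counter(line.strip() for line in log_text.splitlines() if line.strip())


def approx_complexity(source: str) -> int:
    """Return 1 plus the number of branching nodes in a Python source string."""
    tree = ast.parse(source)
    return 1 + sum(isinstance(node, BRANCH_NODES) for node in ast.walk(tree))


def rank_hotspots(churn: Counter, complexity: dict[str, int]) -> list[tuple[str, int]]:
    """Score each file as churn * complexity and sort highest score first."""
    scores = {path: churn[path] * cx for path, cx in complexity.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical git log output: each non-blank line is one change to a file.
SAMPLE_LOG = """
billing/invoice.py
users/models.py

billing/invoice.py
billing/invoice.py
reports/export.py
"""

# Hypothetical file contents keyed by path.
SAMPLE_SOURCES = {
    "billing/invoice.py": (
        "def total(items):\n"
        "    s = 0\n"
        "    for i in items:\n"
        "        if i.taxable:\n"
        "            s += i.price\n"
        "    return s\n"
    ),
    "users/models.py": "def name(u):\n    return u.name\n",
    "reports/export.py": "def export(rows):\n    return list(rows)\n",
}

churn = churn_from_git_log(SAMPLE_LOG)
complexity = {path: approx_complexity(src) for path, src in SAMPLE_SOURCES.items()}
hotspots = rank_hotspots(churn, complexity)

for path, score in hotspots:
    print(f"{score:>4}  {path}")
```

Multiplying the two signals means a file ranks high only when it is both frequently changed and branch-heavy; a stable but complex file, or a busy but trivial one, falls toward the bottom. A real run would likely swap the `ast` proxy for a dedicated metric from a tool such as `radon` or `lizard`, which the Quick start installs.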

    Use Cases

    • The "I just inherited 200k lines of code from someone who left" survival kit. Reverse-engineers any codebase into dependency maps, dead-code reports, risk hotspots, and an onboarding guide. Saves 2-3 weeks of tribal-knowledge gathering.

    Security Scanned

    Passed automated security review

    Permissions

    Terminal / Shell

    File Scopes

    scripts/**

    Works with any agent that supports the universal SKILL.md standard
