    video-analyzer

    Transform raw video files into structured JSON and Markdown reports with local character detection and transcription.

    Updated Jun 2026
    0 installs

    Free

    Included in download

    • Downloadable skill package
    • 1 permission declared
    • Instant install

    Sample input

    Analyze latest_demo.mp4. I need to know who appears in it, what they said, and a summary of the visual timeline for my project report.

    Sample output

    Analysis complete. I've generated `latest_demo_report.md` and `latest_demo_report.json`.

    Key Highlights:
    - **Characters:** 2 (Presenter, Interviewee)
    - **Transcript:** Discussing the new API v2 features.
    - **Visuals:** Starts with screen share of terminal, transitions to 2-person split view.

    About This Skill

    Convert Videos into Searchable, AI-Readable Intelligence

    The Video Analyzer skill bridges the gap between raw video footage and your AI workflows. Many LLMs cannot accept video uploads directly, or have context windows too small for long files. This skill works around both limits by processing video files locally into structured Markdown and JSON reports that cover who appears on screen, what was said, and how the visuals change over time.

    What it does

    This developer-focused tool analyzes both the audio and the visuals of a video locally on your machine. It breaks the video down in four stages:

    • Automated Pre-processing: Extracts frames and audio using ffmpeg and OpenCV.
    • Speech-to-Text: Generates transcripts from the extracted audio using OpenAI's Whisper.
    • Computer Vision: Detects distinct characters, identifies visual changes, and builds a visual timeline of events.
    • Structured Output: Compiles metadata, transcripts, and visual descriptions into a schema-ready JSON file and a human-readable Markdown report.
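
The pre-processing stage above can be sketched as two ffmpeg invocations: one that writes the audio track for transcription and one that samples still frames for visual analysis. This is a minimal sketch, not the skill's actual implementation. The function names, output paths, the one-frame-per-second sampling rate, and the 16 kHz mono audio format are all assumptions made for illustration.

```python
from pathlib import Path


def audio_extract_cmd(video: Path, wav_out: Path) -> list[str]:
    """Build an ffmpeg command that writes a mono 16 kHz audio file.

    Hypothetical helper: the skill's real command line is not documented.
    """
    return [
        "ffmpeg", "-y",      # overwrite the output file if it exists
        "-i", str(video),    # input video
        "-vn",               # drop the video stream
        "-ac", "1",          # downmix to one audio channel
        "-ar", "16000",      # resample to 16 kHz
        str(wav_out),
    ]


def frame_extract_cmd(video: Path, frames_dir: Path, fps: float = 1.0) -> list[str]:
    """Build an ffmpeg command that saves `fps` frames per second as JPEGs.

    Hypothetical helper: the sampling rate and file pattern are assumptions.
    """
    return [
        "ffmpeg", "-y",
        "-i", str(video),
        "-vf", f"fps={fps}",                   # sample frames at a fixed rate
        str(frames_dir / "frame_%05d.jpg"),    # numbered output files
    ]
```

Each command list can be passed to `subprocess.run(cmd, check=True)` once ffmpeg is installed and the output directory exists. Building the commands as lists rather than shell strings avoids quoting problems with file names that contain spaces.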

    Why use this skill

    Prompting an AI to "watch a video" often leads to hallucinations or missing segments. This skill instead uses deterministic local processing over the whole file, so the report does not rely on a model watching the footage. It is private: processing happens via local subprocesses with no outbound network calls. The output is well suited to RAG (Retrieval-Augmented Generation) or to use as context in long-form coding and analysis tasks.
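
To illustrate the RAG use mentioned above, the sketch below groups transcript segments from a report into timestamped chunks that could be embedded and indexed. The report layout is an assumption: the listing does not publish the JSON schema, so the `transcript` key and the `start`, `end`, and `text` fields are hypothetical, as is the `transcript_chunks` helper.

```python
def transcript_chunks(report: dict, max_chars: int = 200) -> list[dict]:
    """Merge consecutive transcript segments into chunks of at most max_chars.

    Assumes a hypothetical report shape:
    {"transcript": [{"start": float, "end": float, "text": str}, ...]}
    A single segment longer than max_chars is kept whole as its own chunk.
    """
    chunks: list[dict] = []
    current = None
    for seg in report.get("transcript", []):
        text = seg["text"].strip()
        if current and len(current["text"]) + 1 + len(text) <= max_chars:
            # Segment still fits: extend the open chunk and its end time.
            current["text"] += " " + text
            current["end"] = seg["end"]
        else:
            # Close the open chunk (if any) and start a new one.
            if current:
                chunks.append(current)
            current = {"start": seg["start"], "end": seg["end"], "text": text}
    if current:
        chunks.append(current)
    return chunks
```

Keeping the start and end times on each chunk lets a retrieval hit point back to the matching moment in the video.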

    Supported tools

    The skill uses ffmpeg for media handling, Whisper for audio transcription, and OpenCV for frame analysis. Supported formats include .mp4, .mov, .avi, .mkv, and .webm.
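
A caller may want to check a file against the formats listed above before starting an analysis. The sketch below shows one way to do that. The `SUPPORTED_EXTENSIONS` set and the `is_supported` helper are illustrative names, not part of the skill, and the check looks only at the file extension, not at the actual container contents.

```python
from pathlib import Path

# Extensions taken from the format list in this listing.
SUPPORTED_EXTENSIONS = {".mp4", ".mov", ".avi", ".mkv", ".webm"}


def is_supported(path: str) -> bool:
    """Return True if the file extension is in the listed set (case-insensitive)."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

For example, `is_supported("latest_demo.mp4")` returns `True`, while `is_supported("notes.txt")` returns `False`.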

    Use Cases

    • Convert video meetings into structured, searchable documentation
    • Extract visual timelines and transcripts for AI-assisted video editing
    • Generate comprehensive metadata for local video libraries and RAG pipelines
    • Identify and count character appearances in cinematic or security footage

    Security Scanned

    Passed automated security review

    Permissions

    Terminal / Shell

    File Scopes

    video-analyzer/**

    Compatible with SKILL.md-compatible agents including Claude Code and Cursor.
