windows-desktop-automation
by Zicheng Liao
High-performance Windows automation using native UI trees, OpenCV image matching, and Tesseract OCR.
- Automate legacy Windows apps that lack APIs or web interfaces
- Extract text from non-selectable UI regions using high-accuracy OCR
- Perform multi-step GUI testing with native element identification
$15
One-time purchase
Also available via Agensi MCP: your AI agent can load this skill on demand.
Included in download
- Extract text from non-selectable UI regions using high-accuracy OCR
- Perform multi-step GUI testing with native element identification
- Terminal and browser automation included
- Includes example output and usage patterns
- Instant install
See it in action
Main Window 'Notepad' found (PID: 8422).
UI Element 'Edit' focus: OK.
Action: Typed 'Automation Report' via UIA.
Action: Menu select 'File -> Save As' successful.
OCR Verify: Found text 'Save As' at (450, 320) with 94% confidence.
Task completed in 1.4s.
About This Skill
What it does
This skill provides an automation suite for Windows desktop applications. It uses the native UI Automation (UIA) backend via pywinauto to interact directly with an application's accessibility tree. Because controls are addressed by name and type rather than located by scanning screenshots, this is typically faster and more reliable than traditional pixel-scanning tools.
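As a rough illustration of what driving an application through the UIA backend can look like, here is a minimal sketch using pywinauto directly. It is not taken from the skill's own scripts: the function name `type_into_notepad`, the title pattern, and the timeout are assumptions, and on recent Windows versions Notepad's window title and process handling may differ.

```python
import sys


def type_into_notepad(text: str, timeout: float = 10.0) -> None:
    """Start Notepad and type `text` into it through the UIA backend.

    Hypothetical helper for illustration; not part of the skill itself.
    """
    if sys.platform != "win32":
        raise RuntimeError("pywinauto's UIA backend requires Windows")

    # Imported lazily so this module can still be loaded on other platforms.
    from pywinauto import Application

    app = Application(backend="uia").start("notepad.exe")
    # Assumed title pattern; adjust it to the window you are targeting.
    window = app.window(title_re=".*Notepad")
    # Block until the window is visible and enabled instead of sleeping.
    window.wait("ready", timeout=timeout)
    # with_spaces=True keeps the space in the typed text.
    window.type_keys(text, with_spaces=True)
```

On a Windows machine with pywinauto installed, `type_into_notepad("Automation Report")` would be the call corresponding to the sample log on this page.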
Why use this skill
Unlike basic macro recorders, this developer-centric skill handles the complex reality of desktop automation. It provides a layered reliability model: if native UI elements are hidden, it falls back to OpenCV image matching; if text is non-selectable, it uses Tesseract OCR. This layering helps keep your automations from breaking when a window moves by a few pixels or a UI theme changes.
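The layered model described above can be pictured as an ordered chain of locators, where each layer is tried only if the previous one finds nothing. The sketch below is one possible shape for that idea, not the skill's actual implementation; `locate_with_fallback` and the stub locators are hypothetical names, and real locators would wrap pywinauto, OpenCV, and pytesseract calls.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Point = Tuple[int, int]
Locator = Callable[[], Optional[Point]]


def locate_with_fallback(locators: Sequence[Tuple[str, Locator]]) -> Tuple[str, Point]:
    """Try each (name, locator) pair in order and return the first hit.

    A locator returns screen coordinates, or None when it finds nothing.
    Exceptions are recorded and treated as a miss for that layer.
    """
    failures: List[str] = []
    for name, locator in locators:
        try:
            point = locator()
        except Exception as exc:
            failures.append(f"{name}: {exc}")
            continue
        if point is not None:
            return name, point
        failures.append(f"{name}: no match")
    raise LookupError("all locators failed: " + "; ".join(failures))


# Stub locators standing in for the real layers (assumed for illustration).
def native_lookup() -> Optional[Point]:
    return None  # pretend the control is not exposed in the UI tree


def image_lookup() -> Optional[Point]:
    return (450, 320)  # pretend template matching found the button


layer, position = locate_with_fallback(
    [("uia", native_lookup), ("image", image_lookup)]
)
```

Ordering the list from most to least reliable keeps the slower vision-based layers out of the common path.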
Supported Tools & Frameworks
- pywinauto: Native control interaction (buttons, menus, tree views, datagrids).
- OpenCV: Computer vision for custom-drawn interfaces and games.
- pytesseract: Optical Character Recognition for screen text extraction.
- pyautogui: Global hotkeys, mouse movement, and low-level input.
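For the OCR layer, one common pattern is to keep only words recognized above a confidence threshold. The sketch below assumes input shaped like the dictionary that `pytesseract.image_to_data(image, output_type=pytesseract.Output.DICT)` returns (parallel lists under keys such as `text`, `conf`, `left`, `top`); the function name `confident_words` and the threshold of 80 are illustrative choices, not values from the skill.

```python
from typing import Any, Dict, List


def confident_words(data: Dict[str, List[Any]], min_conf: float = 80.0) -> List[Dict[str, Any]]:
    """Return non-empty words whose OCR confidence is at least min_conf.

    Confidence values may arrive as ints, floats, or strings (for example
    "-1" for non-text rows), so they are normalized with float().
    """
    words = []
    for text, conf, left, top in zip(
        data["text"], data["conf"], data["left"], data["top"]
    ):
        score = float(conf)
        if text.strip() and score >= min_conf:
            words.append({"text": text, "conf": score, "x": left, "y": top})
    return words


# Hand-built sample in the same shape, so the sketch runs without Tesseract.
sample = {
    "text": ["", "Save", "As", "x"],
    "conf": ["-1", 94, 95.5, 12],
    "left": [0, 450, 495, 600],
    "top": [0, 320, 320, 10],
}
hits = confident_words(sample)
```

Here `hits` contains the entries for "Save" and "As"; the empty row and the low-confidence "x" are dropped.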
The Output
Expect robust execution of desktop tasks. Instead of fragile coordinate-based clicks, the skill generates scripts that wait for specific UI states, interact with elements by their internal IDs, and handle window focus automatically. The result is "set and forget" automation for legacy software, ERP systems, and desktop utilities.
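Waiting for a UI state rather than sleeping for a fixed time usually comes down to polling a condition until a deadline. The helper below is a generic sketch of that pattern; `wait_until` is a hypothetical name, and pywinauto also ships its own waiting methods (for example `wait("ready", timeout=...)` on a window specification), which may be what generated scripts actually use.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def wait_until(predicate: Callable[[], T], timeout: float = 10.0, interval: float = 0.2) -> T:
    """Poll predicate() until it returns a truthy value, then return it.

    Raises TimeoutError if no truthy value is seen within `timeout` seconds.
    The predicate is always called at least once.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = predicate()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(interval)


# Stub condition that becomes true on the third poll (for illustration only).
calls = {"count": 0}


def dialog_is_open() -> bool:
    calls["count"] += 1
    return calls["count"] >= 3


opened = wait_until(dialog_is_open, timeout=1.0, interval=0.01)
```

In a real script the predicate would check something observable, such as whether a "Save As" dialog exists.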
Use Cases
- Automate legacy Windows apps that lack APIs or web interfaces
- Extract text from non-selectable UI regions using high-accuracy OCR
- Perform multi-step GUI testing with native element identification
- Create reliable hotkey macros and automated data entry workflows
- Interact with custom-drawn controls using OpenCV template matching
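To show the idea behind template matching for custom-drawn controls, here is a dependency-free sketch that slides a small grayscale template over an image and keeps the position with the lowest sum of absolute differences. It illustrates the technique only; in practice OpenCV's `cv2.matchTemplate` does this far faster on real screenshots, and the page does not say which matching method the skill uses. The name `match_template` and the tiny sample grid are assumptions.

```python
from typing import List, Tuple

Gray = List[List[int]]  # rows of grayscale pixel values


def match_template(image: Gray, template: Gray) -> Tuple[Tuple[int, int], int]:
    """Return ((x, y), score) for the best match of template inside image.

    The score is the sum of absolute pixel differences; 0 is an exact match.
    """
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    if th > ih or tw > iw:
        raise ValueError("template must not be larger than the image")

    best_pos, best_score = (0, 0), None
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            score = sum(
                abs(image[y + dy][x + dx] - template[dy][dx])
                for dy in range(th)
                for dx in range(tw)
            )
            if best_score is None or score < best_score:
                best_pos, best_score = (x, y), score
    return best_pos, best_score


# A 4x4 "screenshot" with a 2x2 "button" whose top-left corner is at x=1, y=2.
screen = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 9, 8, 0],
    [0, 7, 6, 0],
]
button = [
    [9, 8],
    [7, 6],
]
position, score = match_template(screen, button)
```

A real workflow would compare the best score against a threshold before clicking, since the best position is returned even when nothing on screen resembles the template.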
How to Install
unzip windows-desktop-automation.zip -d ~/.claude/skills/
Reviews
No reviews yet.
Only users who have downloaded or purchased this skill can leave a review.
Early access skill
Security Scanned
Passed automated security review