windows-desk-automation
by Roy Yuen
Reliable UIA-based Windows desktop automation with OCR and image matching fallbacks.
- Automate repetitive data entry in legacy Win32 ERP systems
- Perform end-to-end GUI testing for native Windows desktop applications
- Scrape data from desktop apps that lack API or web interfaces
$9
One-time purchase
Included in download
- Perform end-to-end GUI testing for native Windows desktop applications
- Scrape data from desktop apps that lack API or web interfaces
- Terminal automation included
- Includes example output and usage patterns
See it in action
SUCCESS: Automated 'Notepad' save workflow.
- Found Document: 'Edit' (UIA Object)
- Action: set_text('Build log initialized')
- Action: hotkey('ctrl+s')
- Verification: Found 'Save As' dialog window.
- Asset: Saved file 'C:\Logs\init.txt' exists.
Also available via Agensi MCP: your AI agent can load this skill on demand.
About This Skill
Professional Windows Desktop Automation
This skill lets your AI agent control native Windows applications through UI Automation (UIA). Unlike macro recorders or vision-only tools, it addresses controls as objects in the application's UI tree rather than as screen coordinates, which makes actions faster and less sensitive to layout changes.
What it does
- Object-Based Control: Interacts with Windows UI elements using automation IDs, control types, and class names via pywinauto.
- Intelligent Fallbacks: Automatically switches to OCR or image matching only when standard UIA metadata is unavailable.
- Deterministic Workflows: Performs precise actions like text entry, menu navigation, and state assertions rather than relying on brittle coordinate-based clicks.
- Multi-App Support: Works with standard Win32, WPF, Qt, and modern .NET applications.
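The fallback order described above can be pictured as a chain of locator strategies that are tried in sequence. The sketch below is illustrative only and is not the skill's actual code: `Locator`, `Match`, `locate_with_fallbacks`, and the three lookup functions are hypothetical names, and the real UIA, OCR, and image-matching calls are replaced by stand-in functions that return fixed rectangles.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple

# (left, top, right, bottom) in screen pixels
Rect = Tuple[int, int, int, int]


@dataclass(frozen=True)
class Locator:
    """A named strategy that returns a rectangle, or None when it finds nothing."""
    name: str
    find: Callable[[str], Optional[Rect]]


@dataclass(frozen=True)
class Match:
    strategy: str
    rect: Rect


def locate_with_fallbacks(target: str, locators: Sequence[Locator]) -> Optional[Match]:
    """Try each locator in order and return the first hit, or None if all miss."""
    for locator in locators:
        rect = locator.find(target)
        if rect is not None:
            return Match(locator.name, rect)
    return None


# Stand-ins for the real lookups (assumptions for illustration only).
def uia_lookup(target: str) -> Optional[Rect]:
    # Pretend the UIA tree exposes a "Save" button with usable metadata.
    return {"Save": (10, 10, 90, 40)}.get(target)


def ocr_lookup(target: str) -> Optional[Rect]:
    # Pretend OCR can read a label drawn on a custom-rendered canvas.
    return {"Canvas label": (200, 300, 320, 330)}.get(target)


def image_lookup(target: str) -> Optional[Rect]:
    # Pretend no reference image matches anything on screen.
    return None


# UIA first, then OCR, then image matching as the last resort.
DEFAULT_CHAIN = [
    Locator("uia", uia_lookup),
    Locator("ocr", ocr_lookup),
    Locator("image", image_lookup),
]
```

Recording which strategy produced the match makes it easy to log when a workflow had to leave the UIA path, which is usually a sign that a selector needs attention.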
Why use this skill?
Manual prompt-based automation often fails because LLMs struggle with window handles, DPI scaling, and hidden UI hierarchies. This skill provides a structured framework that first inspects the application's control tree and builds a plan before acting. It also handles process attachment, detection of admin elevation, and state verification after each action, the low-level details that ad hoc scripts tend to skip.
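One way to picture the plan-then-verify approach is a list of steps, each pairing an action with a check that must pass before the next step runs. This is a minimal sketch under stated assumptions, not the skill's implementation: `Step`, `run_plan`, and the dictionary used as application state are hypothetical, and the actions only mutate that dictionary instead of driving a real window.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

# Stand-in for the observable state of the application under automation.
State = Dict[str, object]


@dataclass(frozen=True)
class Step:
    description: str
    action: Callable[[State], None]   # performs the UI action
    verify: Callable[[State], bool]   # confirms the expected UI state afterwards


def run_plan(steps: Sequence[Step], state: State) -> List[str]:
    """Run steps in order and stop at the first failed verification."""
    log: List[str] = []
    for step in steps:
        step.action(state)
        if not step.verify(state):
            log.append(f"FAILED: {step.description}")
            break
        log.append(f"OK: {step.description}")
    return log


# A simulated version of the Notepad save workflow (no real window involved).
def type_text(state: State) -> None:
    state["text"] = "Build log initialized"


def press_save(state: State) -> None:
    state["dialog"] = "Save As"


NOTEPAD_PLAN = [
    Step("set_text", type_text, lambda s: s.get("text") == "Build log initialized"),
    Step("hotkey ctrl+s", press_save, lambda s: s.get("dialog") == "Save As"),
]
```

Stopping at the first failed check keeps a workflow from typing into the wrong window after an unexpected dialog appears.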
Advanced Capabilities
- Full UIA tree dumping for selector discovery.
- Hotkey-driven navigation for standard Windows shortcuts.
- OCR-based location for custom-rendered canvases.
- Integrated verification steps to confirm UI states post-action.
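Hotkeys written in the `ctrl+s` style shown in the example output have to be translated before they can be sent through pywinauto, whose `send_keys` function uses `^` for Ctrl, `+` for Shift, `%` for Alt, and braces for named keys such as `{ENTER}`. The converter below is a hypothetical helper, not part of the skill or of pywinauto, and it deliberately supports only letters, digits, function keys, and a few named keys.

```python
import re

# pywinauto send_keys modifier prefixes: ^ = Ctrl, + = Shift, % = Alt.
MODIFIERS = {"ctrl": "^", "control": "^", "shift": "+", "alt": "%"}

# A small subset of the named keys that send_keys accepts in braces.
NAMED_KEYS = {
    "enter": "{ENTER}",
    "tab": "{TAB}",
    "esc": "{ESC}",
    "escape": "{ESC}",
    "delete": "{DELETE}",
    "backspace": "{BACKSPACE}",
    "space": "{SPACE}",
}


def to_send_keys(hotkey: str) -> str:
    """Convert a hotkey such as 'ctrl+shift+s' into send_keys syntax ('^+s')."""
    parts = [part.strip().lower() for part in hotkey.split("+")]
    if any(not part for part in parts):
        raise ValueError(f"malformed hotkey: {hotkey!r}")

    # Everything before the last token is a modifier; the last token is the key.
    *modifiers, key = parts
    prefix = ""
    for modifier in modifiers:
        if modifier not in MODIFIERS:
            raise ValueError(f"unknown modifier: {modifier!r}")
        prefix += MODIFIERS[modifier]

    if key in NAMED_KEYS:
        token = NAMED_KEYS[key]
    elif re.fullmatch(r"f([1-9]|1[0-9]|2[0-4])", key):
        token = "{" + key.upper() + "}"   # F1 through F24
    elif len(key) == 1 and key.isalnum():
        token = key                        # plain letter or digit
    else:
        raise ValueError(f"unsupported key: {key!r}")
    return prefix + token
```

On a Windows machine with pywinauto installed, the result can be passed straight to `pywinauto.keyboard.send_keys`, for example `send_keys(to_send_keys("ctrl+s"))`.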
Use Cases
- Automate repetitive data entry in legacy Win32 ERP systems
- Perform end-to-end GUI testing for native Windows desktop applications
- Scrape data from desktop apps that lack API or web interfaces
- Create hotkey-driven workflows for complex creative software tasks
How to Install
unzip windows-desk-automation.zip -d ~/.claude/skills/
Reviews
No reviews yet - be the first to share your experience.
Only users who have downloaded or purchased this skill can leave a review.
Early access skill
Security Scanned
Passed automated security review