Windows Desktop Automation Architect for AI Coding Agents

Name: Windows Desktop Automation Architect for AI Coding Agents
Price: 9.99 USD
Availability: InStock
Author: Agensi

Designs robust Windows desktop automation workflows using pywinauto, UI Automation, hotkeys, image matching, OCR, retries, logging, screenshots, and safety controls.

Updated May 2026

Convert manual Windows clicks into resilient pywinauto selector-based scripts.
Add screenshot diagnostics and retry logic to brittle desktop bots.
Automate legacy business software that lacks modern APIs or web interfaces.

Compatible with: ChatGPT Custom GPTs, ChatGPT Agents, Cursor, Claude Code

Security scanned · One-time purchase · Instant install

$9.99

· or 50 credits

One-time purchase

30-day refund guarantee

Secure checkout via Stripe

Included in download

Convert manual Windows clicks into resilient pywinauto selector-based scripts.
Add screenshot diagnostics and retry logic to brittle desktop bots.
terminal, file_write, file_read automation included
Ready for ChatGPT Custom GPTs

Sample Output

A real example of what this skill produces.

=== WINDOWS DESKTOP AUTOMATION ARCHITECTURE PLAN ===

Original request: Create a pywinauto automation architecture plan and AI coding agent prompt for a Windows desktop app that exports a monthly sales report.

Interpreted automation goal: Automate a Windows desktop workflow that opens the target app, navigates to Reports, selects a monthly date range, exports a sales report as an Excel file, handles the Save As dialog, and verifies the exported file.

Target application: Unspecified Windows desktop application

Workflow type: Report export automation

Assumptions:

The user is authorized to automate this app.
The app has a visible Reports section.
The export creates an Excel file.
pywinauto may be able to inspect the app through UIA or Win32.
A standard Windows Save As dialog may appear during export.

Unknowns to clarify:

application executable name
app technology: Win32, WPF, WinForms, Electron, Java, Qt, Citrix, or unknown
whether login is required
output folder
whether report controls expose stable properties
whether date controls are accessible
DPI scaling and screen resolution
whether the workflow runs locally or through Remote Desktop/Citrix

Better alternatives to GUI automation: Before GUI automation, check whether the app provides an API, scheduled export, command-line export, database report, built-in report scheduler, or file-based export configuration. If no reliable non-GUI method exists, desktop automation is appropriate.

Recommended automation strategy: Use pywinauto UI Automation first if controls are visible in the accessibility tree. Test the Win32 backend if UIA does not expose stable controls. Use stable selectors for the main window, report navigation, date fields, export button, and Save As dialog. Use hotkeys only after focus verification. Use image matching for inaccessible custom buttons. Use OCR only for status text or confirmation text when control text is unavailable.
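
The backend choice described above can be sketched as a small probe-and-compare step. This is a minimal sketch under stated assumptions, not the skill's actual output: `count_controls`, `pick_backend`, and the `min_controls` cutoff are hypothetical, and "enough visible controls" is only a rough proxy for "stable selectors". The pywinauto import is deferred so the selection logic can be exercised without Windows.

```python
def count_controls(backend, title_re):
    """Count the controls pywinauto can see in a window under one backend.

    Windows-only. pywinauto is imported lazily so the selection logic
    below can be imported and tested on any platform.
    """
    from pywinauto import Desktop  # deferred: requires Windows

    window = Desktop(backend=backend).window(title_re=title_re)
    window.wait("exists", timeout=10)
    return len(window.descendants())


def pick_backend(title_re, probe=count_controls, min_controls=5):
    """Prefer UIA when it exposes enough controls, otherwise try Win32.

    min_controls is an assumed cutoff; tune it per application.
    Returns "uia", "win32", or None when neither backend looks usable.
    """
    for backend in ("uia", "win32"):
        try:
            found = probe(backend, title_re)
        except Exception:
            # A failed probe (window missing, timeout) counts as "nothing visible".
            found = 0
        if found >= min_controls:
            return backend
    return None  # fall back to image matching or OCR
```

A `None` result is the signal to consider the image matching and OCR fallbacks rather than forcing selectors that do not exist.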

Workflow map:

Launch or connect to the application.
Wait for the main window to exist and become visible.
Handle login or stop for secure manual login if required.
Navigate to Reports.
Select the monthly sales report.
Set the start and end date fields.
Verify the date range was accepted.
Trigger Export.
Wait for the Save As dialog.
Save the file to the configured output folder.
Wait for export completion.
Verify the Excel file exists and file size is greater than zero.
Log success and final output path.
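
The workflow map can be expressed as an ordered list of named steps run by a small driver. This is a minimal sketch: `run_workflow`, the step names, and the convention that dry-run stops just before a step named "export" are assumptions for illustration. An exception in any step propagates, which gives stop-on-error behavior.

```python
def run_workflow(steps, dry_run=False, stop_before="export", log=print):
    """Run (name, callable) steps in order and return the names completed.

    In dry-run mode the driver stops before the step named stop_before,
    so nothing at or after the export is executed.
    """
    completed = []
    for name, action in steps:
        if dry_run and name == stop_before:
            log(f"dry-run: stopping before '{name}'")
            break
        log(f"start: {name}")
        action()  # any exception aborts the run
        completed.append(name)
        log(f"done: {name}")
    return completed
```

Each callable would wrap one line of the map, for example "wait for the main window" or "set the date fields", so a failure can be logged against a specific step.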

Selector strategy:

Use process name or executable path for connection.
Use main window title regex instead of exact volatile titles.
Use automation ID, control type, class name, or stable text for controls.
Use parent-child hierarchy when needed.
Use standard Windows dialog selectors for Save As.
Avoid raw coordinates unless no selector, hotkey, image, or OCR alternative exists.
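
One way to keep selectors stable is to hold them in a single table and resolve them through one helper. This is a sketch, assuming the UIA backend: every title, automation ID, and selector name below is hypothetical and would come from inspecting the real application. `child_window` and `wait` are pywinauto `WindowSpecification` methods.

```python
import re

# Hypothetical selectors; real values come from inspecting the target app.
SELECTORS = {
    "main_window": {"title_re": r".*Sales Manager.*"},
    "export_button": {"auto_id": "btnExport", "control_type": "Button"},
    "save_as_dialog": {"title": "Save As", "control_type": "Window"},
}


def title_matches(selector_name, actual_title):
    """Check a window title against the selector's regex (anchored at the start)."""
    pattern = SELECTORS[selector_name]["title_re"]
    return re.match(pattern, actual_title) is not None


def find_control(parent, selector_name, timeout=10):
    """Resolve a control under parent (a pywinauto WindowSpecification).

    Waits until the control exists and is enabled, so callers never act
    on a control that is still loading.
    """
    spec = parent.child_window(**SELECTORS[selector_name])
    spec.wait("exists enabled", timeout=timeout)
    return spec
```

A title regex such as the one above tolerates version numbers or user names in the caption, which an exact title match would not.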

Hotkey strategy: Use hotkeys only when:

the active window is verified
focus is known
the shortcut is stable
the resulting state is verified

Possible fallback hotkeys:

documented Alt menu shortcuts
Ctrl+S only if it reliably triggers export/save
Tab navigation only with state verification, not blind repeated sequences
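
The hotkey conditions above amount to a check before and a check after the keystroke. This is a minimal sketch: `send_hotkey_verified` and its callable parameters are hypothetical. With pywinauto, `send` could be `pywinauto.keyboard.send_keys`, where `"^s"` means Ctrl+S and `"%f"` means Alt+F; how the active title is read is left to the caller.

```python
import re


def send_hotkey_verified(keys, expected_title_re, get_active_title, send, state_ok):
    """Send a hotkey only when the expected window is active, then verify.

    get_active_title: callable returning the foreground window title.
    send:             callable that delivers the keystrokes.
    state_ok:         callable returning True once the expected result is visible.
    """
    title = get_active_title() or ""
    if not re.match(expected_title_re, title):
        raise RuntimeError(f"refusing to send {keys!r}: active window is {title!r}")
    send(keys)
    if not state_ok():
        raise RuntimeError(f"{keys!r} was sent but the expected state was not reached")
```

Raising instead of retrying keeps a misdirected keystroke from being repeated into the wrong window.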

Image matching strategy: Use image matching only for inaccessible custom controls. Requirements:

restrict search to toolbar or report panel regions
define match threshold
capture templates at the same DPI and theme
validate expected state after clicking
capture screenshot if image target is not found
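
The region and threshold requirements can be sketched with OpenCV template matching. The plan does not name an image library, so OpenCV (`opencv-python`) is an assumption here, as are the helper names `crop_box` and `find_template` and the 0.9 default threshold. Inputs are assumed to be grayscale NumPy arrays captured at the same DPI and theme.

```python
def crop_box(region, image_size):
    """Clamp an (x, y, w, h) search region to the image bounds.

    Returns (left, top, right, bottom); raises ValueError when the
    region lies entirely outside the image.
    """
    x, y, w, h = region
    width, height = image_size
    left, top = max(0, x), max(0, y)
    right, bottom = min(width, x + w), min(height, y + h)
    if right <= left or bottom <= top:
        raise ValueError("search region lies outside the image")
    return left, top, right, bottom


def find_template(screenshot, template, region, threshold=0.9):
    """Return the screen-space centre of the best match, or None.

    Only the given region is searched, and matches scoring below the
    threshold are rejected.
    """
    import cv2  # deferred: optional dependency

    size = (screenshot.shape[1], screenshot.shape[0])
    left, top, right, bottom = crop_box(region, size)
    roi = screenshot[top:bottom, left:right]
    t_h, t_w = template.shape[:2]
    if roi.shape[0] < t_h or roi.shape[1] < t_w:
        return None  # region is smaller than the template
    scores = cv2.matchTemplate(roi, template, cv2.TM_CCOEFF_NORMED)
    _, best, _, best_loc = cv2.minMaxLoc(scores)
    if best < threshold:
        return None
    return left + best_loc[0] + t_w // 2, top + best_loc[1] + t_h // 2
```

A `None` result is the point at which the plan calls for a diagnostic screenshot rather than a click.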

OCR strategy: Use OCR only for:

export status text
confirmation messages
error banners
report title verification if control text is unavailable

OCR requirements:

restrict OCR to the status/message region
define expected phrases such as Export complete or Report generated
use confidence thresholds
avoid capturing sensitive report content unless approved
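
The phrase and confidence requirements can be checked independently of the OCR engine. In this sketch the input shape, a list of `(text, confidence)` pairs on a 0 to 100 scale, is an assumption, as are the name `find_status_phrase` and the default cutoff of 80. Low-confidence words are replaced by a placeholder rather than dropped, so that "Export not complete" with an unreadable "not" cannot be mistaken for "Export complete".

```python
def find_status_phrase(words, expected_phrases, min_confidence=80.0):
    """Return the first expected phrase found among confident OCR words.

    words: iterable of (text, confidence) pairs in reading order.
    Words below min_confidence become a placeholder that breaks any
    phrase spanning them.
    """
    tokens = [
        text.lower() if confidence >= min_confidence else "\x00"
        for text, confidence in words
    ]
    joined = " ".join(tokens)
    for phrase in expected_phrases:
        if phrase.lower() in joined:
            return phrase
    return None  # no confident match; ask for manual confirmation
```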

Waits and synchronization:

wait for main window visible
wait for Reports screen loaded
wait for date fields enabled
wait for Export button enabled
wait for Save As dialog
wait for output file creation
wait for file size to stabilize
timeout each step with clear error messages
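
The last three waits (file creation, size stabilization, timeout with a clear message) can be sketched with the standard library alone. The function name and the default timings are assumptions; a file counts as stable here once its non-zero size is unchanged for several consecutive polls.

```python
import os
import time


def wait_for_stable_file(path, timeout=60.0, poll=0.5, stable_checks=3):
    """Wait until path exists, is non-empty, and stops growing.

    Returns the final size in bytes, or raises TimeoutError with a
    message naming the file and the timeout.
    """
    deadline = time.monotonic() + timeout
    last_size = -1
    stable = 0
    while time.monotonic() < deadline:
        try:
            size = os.path.getsize(path)
        except OSError:
            size = -1  # not created yet
        if size > 0 and size == last_size:
            stable += 1
            if stable >= stable_checks:
                return size
        else:
            stable = 0
        last_size = size
        time.sleep(poll)
    raise TimeoutError(f"{path} did not appear or stabilize within {timeout}s")
```

The same return value covers the later verification that the Excel file exists and has a size greater than zero.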

Error handling and recovery:

if app is not running, launch it
if wrong window is active, refocus the main window
if login is required, stop or request secure manual login
if a control is not found, retry and capture screenshot
if an unexpected popup appears, classify and stop or handle known popup
if export times out, retry once
if file already exists, apply overwrite policy
if OCR confidence is low, request manual confirmation
if failure persists, abort safely with logs
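
The "retry, capture a screenshot, then abort" rules can be sketched as one bounded retry helper. `run_with_retry` and its parameters are hypothetical. The `on_failure` hook is where a screenshot would be taken; with pywinauto that could be `window.capture_as_image().save(path)`, which needs Pillow installed.

```python
import time


def run_with_retry(action, max_attempts=3, delay=1.0, on_failure=None,
                   retry_on=(Exception,)):
    """Call action until it succeeds or max_attempts is reached.

    on_failure(attempt, exc) runs after every failed attempt, for
    example to capture a screenshot and write a log entry. The last
    exception is re-raised so the caller can abort safely.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return action()
        except retry_on as exc:
            if on_failure is not None:
                on_failure(attempt, exc)
            if attempt == max_attempts:
                raise
            time.sleep(delay)
```

For the export step, `max_attempts=2` gives the "retry once" behavior listed above.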

Logging and diagnostics: Log:

timestamp
workflow step
selector used
active window title
success or failure
elapsed time
retry count
output path
screenshot path on failure

Do not log:

passwords
tokens
secret values
sensitive report data
full screenshots containing private data unless approved
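
The log and do-not-log lists can be enforced at the point where a record is built. This sketch emits one JSON object per event; the function name, the field names, and the `SENSITIVE_KEYS` set are assumptions and should be extended to match the real workflow.

```python
import json
import time

# Assumed field names that must never reach the log in clear text.
SENSITIVE_KEYS = {"password", "token", "secret", "report_data"}


def make_log_record(step, status, **fields):
    """Build a JSON log line, redacting any field named in SENSITIVE_KEYS."""
    record = {
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%S"),
        "step": step,
        "status": status,
    }
    for key, value in fields.items():
        record[key] = "[REDACTED]" if key.lower() in SENSITIVE_KEYS else value
    return json.dumps(record)
```

Fields such as `selector`, `window_title`, `elapsed`, `retry_count`, `output_path`, and `screenshot_path` pass through unchanged. Redaction by key name does not catch secrets embedded inside other values, so free-text fields still need care.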

Safety controls:

dry-run mode up to the Export action
configurable output folder
overwrite protection
maximum retry limit
user confirmation before overwriting files
stop-on-error behavior
screenshot logging can be disabled for sensitive environments
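
Overwrite protection can be sketched as a path resolver that never returns an existing file unless overwriting was explicitly allowed. The numbered-suffix policy here is one possible choice, an assumption for illustration rather than something the plan prescribes; prompting the user before overwriting is the alternative named above.

```python
from pathlib import Path


def resolve_output_path(folder, filename, overwrite=False):
    """Return a safe target path inside the configured output folder.

    With overwrite=False an existing file is never reused; a numbered
    suffix (report_1.xlsx, report_2.xlsx, ...) is tried instead.
    """
    target = Path(folder) / filename
    if overwrite or not target.exists():
        return target
    counter = 1
    while True:
        candidate = target.with_name(f"{target.stem}_{counter}{target.suffix}")
        if not candidate.exists():
            return candidate
        counter += 1
```

Passing `overwrite=True` only after an explicit user confirmation keeps the default path non-destructive.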

Implementation structure:

automation_project/
  main.py
  config.py
  app_connection.py
  selectors.py
  workflow_steps.py
  image_matching.py
  ocr_utils.py
  logging_utils.py
  safety.py
  recovery.py
  tests/
  logs/
  screenshots/
  README.md

AI coding agent prompt: Build a robust Windows desktop automation script in Python using pywinauto for exporting a monthly sales report from a desktop application. First determine whether the UIA or Win32 backend is more appropriate. Prefer stable selectors over coordinates. Create modular files for app connection, selectors, workflow steps, logging, recovery, image matching fallback, OCR fallback, safety controls, and configuration. Add explicit waits for windows, controls, enabled states, dialogs, export completion, and file creation. Use hotkeys only when the active window and resulting state can be verified. Use image matching only for inaccessible custom controls and OCR only for status verification when control text is unavailable. Add structured logs and screenshots on failure while avoiding sensitive data. Include dry-run mode, overwrite protection, retry limits, and manual confirmation for risky steps. Return automation strategy, selectors used, fallback strategy, how to run the script, testing plan, and known limitations.

Testing plan:

app closed before run
app already open before run
wrong window active
slow Reports screen loading
Save As dialog delayed
output file already exists
export succeeds
export times out
unexpected popup appears
different DPI scaling
screenshot logging disabled
dry-run mode enabled
Remote Desktop or Citrix environment if applicable

Known limitations: Reliability depends on target app technology, control accessibility, app version, Windows permissions, DPI scaling, resolution, language, theme, timing, and local versus remote execution.

Verification checklist:

app launches or connects successfully
main window is detected
Reports screen is reached
date range is set correctly
Export action is triggered only after state verification
Save As dialog is handled
Excel file exists
file size is greater than zero
logs are created
screenshot captured on failure when allowed
no credentials or sensitive report data are logged

⚡ Also available via Agensi MCP - your AI agent can load this skill on demand via MCP. Learn more →

About This Skill

Windows Desktop Automation Architect helps AI coding agents, developers, automation builders, QA teams, operations teams, and business users design reliable Windows GUI automation for desktop applications that do not have easy APIs. It creates automation architecture plans, pywinauto prompts, manual workflow conversion specs, selector strategies, hotkey plans, image matching fallbacks, OCR fallback plans, retry and timeout strategies, logging and screenshot diagnostics, safety checklists, debugging prompts, and test plans. The skill is ideal for automating legacy Windows apps, ERP/CRM tools, report exports, file dialogs, data entry workflows, installer screens, remote desktop workflows, and repetitive internal business processes while avoiding brittle coordinate-only scripts.

Use Cases

Convert manual Windows clicks into resilient pywinauto selector-based scripts.
Add screenshot diagnostics and retry logic to brittle desktop bots.
Automate legacy business software that lacks modern APIs or web interfaces.
Design OCR and image-matching fallbacks for Citrix or Remote Desktop sessions.
Create safety-first workflows for sensitive data entry or file exports.
Audit coordinate-based scripts and make them more reliable.
Create Cursor prompts for robust Windows automation projects.
Plan Citrix or Remote Desktop automation with OCR and image matching.
Automate Save As dialogs and file export workflows.
Build safety checklists for destructive desktop workflows.
Add logging, screenshots, retries, and recovery to automation scripts.

Known Limitations

This skill creates Windows desktop automation plans, prompts, reliability strategies, and test checklists, but it does not guarantee that an automation will work reliably on every machine, app version, DPI setting, resolution, language, theme, permission level, Remote Desktop session, Citrix environment, or Windows configuration. GUI automation can be fragile when applications change, controls are inaccessible, timing varies, OCR confidence is low, image templates drift, or workflows involve sensitive or destructive actions. Production use requires real-environment testing, logging, recovery paths, user authorization, and human review for high-risk workflows.

How to Install

mkdir -p ~/.claude/skills && curl -sL https://www.agensi.io/api/install/windows-desktop-automation-architect-for-ai-coding-agents | tar xz -C ~/.claude/skills/

Free skills install directly. Paid skills require purchase - use the download button above after buying.

Reviews

No reviews yet - be the first to share your experience.

Only users who have downloaded or purchased this skill can leave a review.

Early access skill

Security Scanned

Passed automated security review

Permissions

Terminal / Shell
Write Files
Read Files

Allowed Hosts

https://promptbase.com/profile/shandra?via=mhq19

File Scopes

*.md
*.txt
*.json
*.yaml
*.yml
*.py
src/**
scripts/**
automation/**
tests/**
docs/**
config/**
screenshots/**
templates/**
logs/**
requirements.txt
pyproject.toml
*.log
README.md

This skill uses file access to read user-provided automation scripts, workflow notes, logs, selector dumps, screenshot descriptions, configuration examples, README files, and project files. It uses write access to create structured Markdown/text outputs such as Windows automation architecture plans, pywinauto implementation prompts, manual workflow conversion specs, image matching and OCR plans, reliability audits, safety checklists, debugging prompts, testing plans, documentation, and SKILL.md files. Terminal access is optional and should only be enabled when the agent is expected to inspect or test local automation scripts. Browser or network access is only needed for external documentation research. Environment variable access is not normally required, and secret values should never be exposed.

Frequently Asked Questions

How does this skill improve the reliability of desktop automation compared to standard scripts?

Which AI coding agents and development environments are compatible with this skill?

What exactly is included in the 'Architecture' package?

Do I need to install any heavy software or plugins to use this automation architect?

Can this handle legacy Windows applications that don't have a modern API?

How are updates handled if Windows or the underlying automation libraries change?
