Automate mouse/keyboard actions by detecting on-screen images. This repo ships a PyQt6 GUI to manage templates, create sequences of actions, and configure a visual failsafe that can interrupt or be tested on demand. Built for personal automation, UI testing, and tinkering.
- GUI Tabs: Sequences, Failsafe, Templates, Template Tester, and Recorder
- Template Management: Create, preview, capture from screen, or load from file
- Sequence Editor: Add steps that find a template and then run actions
- Action Types: `click`, `right_click`, `double_click`, `move`, `move_to`, `type`, `key_press`, `wait`, `scroll`, `click_and_hold`, `play_recording`
- Regions & Randomization:
  - Step-level `search_region` for finding templates
  - Click actions can target a random point in a selected region
  - Move-To actions can use a selected region with optional random movement
- Failsafe System: Enable a template-based trigger, define a separate sequence, and test it with a button
- Template Tester: Live preview of template matching with confidence meter and optional search region
- Overlay Previews: One-click visual overlays to preview current search regions and click points (`Show Region` and `Show Click` buttons across editors).
- Recorder Tab: Record global mouse/keyboard input with timing, delete events, save named recordings to `recordings/`, preview overlays without executing, and play recordings as actions in sequences/groups/failsafe.
- Hotkeys & Status: F8 to stop; status bar updates; mouse position tracker
- Config Persistence: Reads/writes `config.json` and keeps template paths relative when possible
- Break Settings: Set a maximum runtime cap (hours/minutes/seconds); live timer shows `Elapsed / Max`; cap overrides loop settings
- Desktop GUI (PyQt6): author templates, sequences, groups, failsafe; run and observe step progress.
- Web Server: lightweight HTTP server with editors and a live MJPEG preview; mirrors most GUI features.
- IPC Bridge: the web server writes commands to `ipc_command.json`; the GUI polls and acts (run/stop/reload/etc.).
- Launcher (Python/Tk + PowerShell): unified start/stop/open; can edit and save `server_config.json` before launch.
- Configuration: `config.json` stores templates, sequences, failsafe, groups, schedules, break settings.
- Assets: `images/`, `failsafe_images/`, `web/static/`.
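The file-based IPC bridge can be pictured roughly as below. This is a minimal sketch, not the project's actual code: only the file name `ipc_command.json` comes from this README, while the helper names `send_command` and `poll_command` and the payload keys (`command`, `sequence`) are assumptions made for illustration.

```python
import json
import os
import tempfile
from pathlib import Path

IPC_FILE = "ipc_command.json"  # file name taken from this README


def send_command(folder, command, **params):
    """Writer side (web server): store one command as JSON.

    The payload layout ({"command": ..., plus extra keys}) is hypothetical.
    """
    target = Path(folder) / IPC_FILE
    payload = {"command": command, **params}
    # Write to a temp file first, then rename, so a reader never sees a half-written file.
    fd, tmp_name = tempfile.mkstemp(dir=folder, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(payload, handle)
    os.replace(tmp_name, target)


def poll_command(folder):
    """Reader side (GUI): return the pending command dict, or None if there is none."""
    target = Path(folder) / IPC_FILE
    try:
        with open(target, encoding="utf-8") as handle:
            payload = json.load(handle)
    except (FileNotFoundError, json.JSONDecodeError):
        return None
    target.unlink(missing_ok=True)  # delete after handling, as described above
    return payload
```

A GUI timer could call `poll_command(...)` every few hundred milliseconds and dispatch on the returned `command` value.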
- Overlay previews in the GUI
  - Added a transparent, non-interactive overlay window used to preview coordinates.
  - New buttons: `Show Region` in Step Editor, Action Editor, and Failsafe sidebar; `Show Click` in Action Editor.
  - Previews draw a red rectangle for regions and a crosshair marker for click points; overlays auto-close after a short timeout.
- Template reload robustness and debug toggle fix
  - Template list is repopulated in the UI first; each template then loads into the bot with per-item error handling. The UI no longer clears if a single template fails.
  - Fixed a NameError when toggling branch debug by reading templates from `self.config` within the handler; the toggle no longer disrupts UI refresh.
- Configuration safety and startup stability
  - Guarded auto‑save during startup with a suppression flag to prevent writing an empty `config.json` while the UI is being constructed.
  - Wrapped early calls like `toggle_failsafe_ui` so they don’t override the saved configuration before load completes.
  - Prevented template tab auto‑switch during config reloads; GUI now defaults to the Sequences tab at startup and after reload.
- Sequence editor robustness
  - Rebuilds a fresh `SequenceEditor` whenever selection changes to avoid stale state and disappearing steps.
  - Added a `_rebuilding_sequence_editor` guard so config mirrors don’t write empty steps during transitions.
  - Preserves existing steps if an editor snapshot returns empty.
  - Added `SequenceEditor.update_groups(...)` so Group Call dropdowns stay in sync with current group names.
- Groups management
  - Fixed a bug where deleted groups reappeared: prevented `update_config_from_ui` from writing stale `group_editor_widget` contents back if the group was removed; cleared editor references after deletion.
- Failsafe improvements
  - Web → GUI template sync: desktop now reads both `failsafe.template_name` and `failsafe.template` and preserves selection during combo refresh. Server writes both keys for full symmetry.
  - Web editor now supports adding normal steps (not only Group Calls) and “Use Preview as Failsafe Region”.
  - Action editor mirrors Sequences behavior: shows only relevant fields per action type (wait → seconds, move_to/drag → X/Y/duration/random/region, keypress → key/modifiers, scroll → pixels, click → button/clicks).
  - Random movement: web Failsafe editor saves `random` and `random_region` so GUI runs move‑to in random mode correctly.
- Web UX and auth
  - Added top‑nav links for a consistent flow between Dashboard, Sequences, Groups, Controls, Schedules, Templates, Failsafe.
  - “Remember token” checkbox added to all pages (Dashboard/Controls/Sequences/Groups/Templates/Failsafe/Schedules); tokens sync via `localStorage` and auto‑populate across pages.
  - Sequences editor: added a “Group” dropdown next to “Add Group Call”; loads groups before rendering; always displays the selected group even if not yet in the global list.
- Defaults and quality of life
  - New steps (GUI and Web) carry sensible defaults: `confidence`, `detection_strategy`, `step_loops`, `monitor`, and a default `actions` list.
  - Injects a global `search_region` into new steps (when present) for faster authoring.
  - Group selectors default to the first available group for quick adds.
- Templates (Web)
  - Added Live Screen Preview to Templates with `Monitor` and `Size` controls, plus a “Using Phone” toggle to enable touch selection and disable scroll while dragging.
  - Click‑drag (or touch‑drag) a selection box over the preview; coordinates are mapped to natural pixels regardless of preview size or zoom.
  - New “Save Selection as Template” workflow: captures the selected region from the chosen monitor and saves it to `images/<name>.png`, then auto‑registers it in `config.json` and shows the preview.
  - Backed by `GET /api/monitors` and `GET /stream.mjpeg`; capture endpoint `POST /api/templates/capture` stores the image and updates templates.
- Python 3.8+ (Windows batch launcher checks 3.8+)
- `opencv-python`, `pyautogui`, `numpy`, `Pillow`, `PyQt6`, `pyqt6-tools`, `darkdetect`, `mss`, `psutil`, `pynput`
- Notes:
  - `pynput` enables global mouse/keyboard recording for the Recorder tab.
  - If you don’t need recording, you can skip `pynput`.
- Install via `requirements.txt`
- Install dependencies: `pip install -r requirements.txt`
- Configuration files:
  - `config.json`: main app configuration (created/updated by GUI/web server)
  - `server_config.json`: web server settings (`bind`, `port`, `token`, `mjpeg_fps`, `backup_retention`)
  - Backups: stored in `backup configs/` with retention (`backup_retention` default 20)
- Clone or download this repository
- Install dependencies: `pip install -r requirements.txt`
- Launch the GUI:
  - Cross-platform: `python bot_gui.py`
  - Windows convenience launcher: `launch_gui.bat`
- Start the server (requires the same Python environment): `python web_server.py`
- Open `http://localhost:8765/` in your browser.
- Auth: the server reads `token` from `server_config.json`; pages include a “Remember token” checkbox that persists your token across all editors via `localStorage`.
- Live preview and region capture require the `mss` package. If the stream or capture returns blank, ensure `mss` is installed and accessible in your Python environment.
- Start the interactive launcher: `python launcher.py`
- Choose to launch the GUI, the Web Server, or both.
- If launching the Web Server, you can either reuse the last `server_config.json` or edit settings (`bind`, `port`, `token`, `mjpeg_fps`, `backup_retention`) in the launcher and save them.
- On Windows, you can alternatively use `launch.bat` (PowerShell wrapper) for a menu-driven experience.
- `launcher.py` (Tkinter GUI)
  - Options: Launch GUI, Launch Web Server, or Both
  - “Use last settings” or “Edit and save new settings” for the web server
  - Buttons: Start, Open Web Portal (opens `http://localhost:<port>/?token=...`), Stop GUI, Stop Web Server, Exit
  - Compiled behavior: launcher runs without a console; GUI (`ImageDetectionBot.exe`) and server (`WebServer.exe`) run with console windows for logs.
- Batch wrappers
  - `run_launcher.bat`: starts `launcher.py` (prefers `venv\Scripts\python.exe`, falls back to `python`)
  - `launch.bat`: invokes `launch.ps1` (menu-driven console)
You can build the whole project in one go using the provided spec:
pip install pyinstaller
pyinstaller ProjectBundle.spec
This creates a `dist/ProjectBundle/` folder containing:
- `Launcher.exe`: Tkinter launcher to start GUI and/or Web Server
- `ImageDetectionBot.exe`: the desktop GUI
- `WebServer.exe`: the HTTP server for the web editors
Static assets (web pages, images, failsafe images) and configs (server_config.json, config.json) are included in the bundle.
Notes:
- If PyInstaller misses dependencies for your environment, add them to `hiddenimports` in `ProjectBundle.spec`.
- For one‑file (`--onefile`) builds, prefer building each binary separately; multi‑exe onefile is not supported.
If you want a single executable that shows the GUI and keeps a visible console for debug logs:
pyinstaller SingleBotConsole.spec
This outputs `dist/ImageDetectionBotConsole.exe`, which launches the GUI and shows a console window with runtime logs. Assets and configs are included. The one-file layout has to be defined in the spec itself: depending on its version, PyInstaller ignores or rejects `--onefile` when it is given a `.spec` file. Use this build when you prefer seeing logs directly without opening the log file.
- Templates tab:
  - Click Add Template, set a name and image path
  - Use “Save Selection as Template” on the web Templates page: choose a monitor, drag to select a region on the live preview, enter a name, then save.
  - Alternatively, use GUI capture or Load Image to pick a file
  - Preview auto-updates
- Sequences tab:
  - Add a sequence, then add steps
  - For each step: set `find` template, `required`, `timeout`, optional `confidence`
  - Add actions like Click, Move, Move-To, Type, Key Press, Wait, Scroll, Play Recording
  - For Click: optionally select a region to click randomly inside
  - For Move-To: optionally enable random and select a region to move within
  - Use `Show Region` or `Show Click` to quickly visualize the current settings before running.
  - Run the selected sequence; press F8 to stop
- Template Tester tab:
  - Pick a template, optionally select a region, and start the live preview to see the confidence value
- Failsafe tab:
  - Enable failsafe, choose a template and confidence, optionally set a search region
  - Build a separate “failsafe sequence” of steps
  - Use `Show Region` to preview the configured failsafe region overlay.
  - Click Test Failsafe to run only the failsafe sequence
Screenshots illustrating the GUI and Web editors are stored under docs/screenshots/.
Suggested filenames (drop your PNGs in that folder):
- GUI
  - `gui_sequences.png`: Sequences tab with steps and actions
  - `gui_failsafe.png`: Failsafe tab with settings and sequence
  - `gui_templates.png`: Templates tab with preview and actions
- Web
  - `web_dashboard.png`: Dashboard page with preview and nav
  - `web_sequences.png`: Sequences editor (steps + actions + preview)
  - `web_failsafe.png`: Failsafe editor (settings + steps + preview)
  - `web_groups.png`: Groups editor (list + steps + nested actions)
  - `web_schedules.png`: Schedules page (rows with Enabled/Sequence/Time)
The project includes a lightweight web server with browser-based editors that mirror most GUI features. Open pages via http://localhost:8765/static/... and append ?token=YOURTOKEN if auth is enabled.
- Pages
  - Sequences (`/static/sequences.html`): sequence list, per-step editor, live preview
  - Failsafe (`/static/failsafe.html`): failsafe settings and sequence editor, live preview
  - Groups (`/static/groups.html`): group management (sequence-like collections), nested actions, live preview
  - Controls (`/static/control.html`): run/stop, non-required-wait toggle, “Run Group”
  - Templates (`/static/templates.html`): templates list, path editor, live screen preview with region selection and “Save Selection as Template”
- Fully implemented in web editors
  - Per-step header controls (Sequences): `find` template, `required`, `confidence`, `timeout`, `monitor`, `Step Loops`, `Detection Strategy`, `Min Inliers`, `Ratio`, `RANSAC`, `Select Search Region`
  - Per-action controls (Sequences/Failsafe): `type`, `button`, `clicks`, `x`/`y`, `duration`, `random`, `seconds`, `pixels`, `key`, `modifiers`, `Select Region`, `Set Random Region`
  - “Add Action” palette with sensible defaults (click, right/double click, move/move_to/drag, type, wait, scroll, click_and_hold)
  - Group Call steps: dropdown picker and save in Sequences and Failsafe; adds a `{ call_group: "GroupName" }` step
  - Groups editor: CRUD for groups; per-step header (`find`, `required`, `timeout`, save, reorder/delete); nested actions with same controls as sequences/failsafe; “Add Action” palette
  - Live Preview on all editors (Sequences/Failsafe/Groups) with monitor selection; click‑drag selection draws a box and maps to natural image coordinates
  - Live Preview on Templates with monitor selection; click‑drag selection and “Save Selection as Template” to capture and register a new template
  - Preview “Size” dropdown (640/800/1024/1280) on all editors; region mapping remains accurate regardless of browser zoom or selected size
- Partially implemented / known gaps
  - Insert‑at‑index for new steps is supported in Sequences via “Add Step Here” (internally appends then reorders)
  - Label polish and defaults can be tuned based on your workflow
- Groups editor now exposes advanced step fields
  - Per-step monitor selection with “Use Global” and specific monitor indexes
  - Step Loops configuration
  - Detection Strategy: `default` (template) or `feature`
  - Feature matcher params: `Min Inliers`, `Ratio`, `RANSAC`
- Web API endpoints (selected)
  - Sequences
    - `GET /api/sequences` → list names
    - `GET /api/sequences/:name` → sequence details
    - `PUT /api/sequences/:name` → update metadata (`loop`, `loop_count`, rename)
    - `POST /api/sequences/:name/steps` → append step
    - `PUT /api/sequences/:name/steps/:idx` → update step
    - `DELETE /api/sequences/:name/steps/:idx` → delete step
    - `POST /api/sequences/:name/steps/reorder` → move a step
    - `POST /api/sequences/:name/steps/:idx/actions` → append action
    - `PUT /api/sequences/:name/steps/:idx/actions/:aidx` → update action
    - `DELETE /api/sequences/:name/steps/:idx/actions/:aidx` → delete action
    - `POST /api/sequences/:name/steps/:idx/actions/reorder` → move an action
  - Templates
    - `GET /api/templates` → list names, paths, existence
    - `GET /api/templates/:name` → template detail (path, exists)
    - `PUT /api/templates/:name` → update path for a template
    - `DELETE /api/templates/:name` → delete template mapping
    - `GET /api/template-image?name=<name>` → serve the template image
    - `POST /api/templates` → add a template mapping `{ name, path }`
    - `POST /api/templates/capture` → capture a selected screen region and save to `images/<name>.png` (or provided `path`), then update the templates map
  - Preview & Monitors
    - `GET /api/monitors` → JSON list of monitors with indexes and bounds
    - `GET /stream.mjpeg?monitor=<index>&token=...` → MJPEG stream of the selected monitor for previews (PNG frames)
  - Failsafe
    - `GET /api/failsafe` → settings
    - `PUT /api/failsafe` → update settings
    - `GET /api/failsafe/sequence` → list steps
    - `POST /api/failsafe/sequence` → append step
    - `PUT /api/failsafe/sequence/:idx` → update step
    - `DELETE /api/failsafe/sequence/:idx` → delete step
    - `POST /api/failsafe/sequence/reorder` → move step
    - `POST /api/failsafe/sequence/:idx/actions` → append action
    - `PUT /api/failsafe/sequence/:idx/actions/:aidx` → update action
    - `DELETE /api/failsafe/sequence/:idx/actions/:aidx` → delete action
    - `POST /api/failsafe/sequence/:idx/actions/reorder` → move action
  - Groups
    - `GET /api/groups` → list group names
    - `GET /api/groups/:name` → group details (supports dict‑ and list‑based storage)
    - `POST /api/groups` → create
    - `PUT /api/groups/:name` → update (rename, steps, `loop`, `loop_count`)
    - `DELETE /api/groups/:name` → delete
    - `POST /api/groups/:name/steps` → append step
    - `PUT /api/groups/:name/steps/:idx` → update step
    - `DELETE /api/groups/:name/steps/:idx` → delete step
    - `POST /api/groups/:name/steps/reorder` → move step
    - `POST /api/groups/:name/steps/:idx/actions` → append action
    - `PUT /api/groups/:name/steps/:idx/actions/:aidx` → update action
    - `DELETE /api/groups/:name/steps/:idx/actions/:aidx` → delete action
    - `POST /api/groups/:name/steps/:idx/actions/reorder` → move action
  - Controls
    - `POST /api/run-options` → set non‑required‑wait (IPC to GUI)
    - `POST /api/run` → run sequence by name (IPC to GUI)
    - `POST /api/run-group` → create a temporary `__RunGroup__` sequence from a group and run it (IPC)
    - `POST /api/stop` → stop current run (IPC)
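When scripting against these endpoints, names that contain spaces need URL-encoding. The helper below is a small sketch for building request URLs; `api_url` is a hypothetical name, the port is the default shown in this README, and passing the token as a `token` query parameter on API calls is an assumption based on how the pages and the stream accept it.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://localhost:8765"  # default address used elsewhere in this README


def api_url(*segments, token=None, **query):
    """Join path segments into an endpoint URL, percent-encoding each segment."""
    path = "/".join(quote(str(segment), safe="") for segment in segments)
    params = dict(query)
    if token:
        params["token"] = token  # assumption: API calls accept ?token=... like the pages do
    query_string = urlencode(params)
    return f"{BASE_URL}/{path}" + (f"?{query_string}" if query_string else "")


# Example: URL for updating step 0 of a sequence named "My Seq"
step_url = api_url("api", "sequences", "My Seq", "steps", 0, token="abc")
```

The resulting string can be passed to any HTTP client together with the method listed for the endpoint.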
- Region selection accuracy
  - The web editors compute coordinates inside the actual image content box (`object-fit: contain`) rather than the element bounds. This keeps natural pixel coordinates stable across preview size changes and browser zoom.
  - The overlay box is drawn in the pane at `paneOffset + contentOffset + contentCoord`, so the visual selection always matches the drawn image.
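The mapping from a pointer position to natural image pixels can be sketched as follows. The web editors do this in the browser; this Python version only illustrates the arithmetic, assuming the image is centered inside its element (the CSS default for `object-fit: contain`). The function name and parameters are hypothetical.

```python
def element_to_natural(px, py, elem_w, elem_h, nat_w, nat_h):
    """Map a point inside the <img> element to natural image pixel coordinates.

    (px, py) are relative to the element's top-left corner.
    """
    # 'contain' scales the image uniformly so it fits entirely inside the element.
    scale = min(elem_w / nat_w, elem_h / nat_h)
    content_w, content_h = nat_w * scale, nat_h * scale
    # The scaled image is centered, leaving letterbox bars on one axis.
    offset_x = (elem_w - content_w) / 2
    offset_y = (elem_h - content_h) / 2
    x = (px - offset_x) / scale
    y = (py - offset_y) / scale
    # Clamp so that clicks on the letterbox bars snap to the image edge.
    x = min(max(x, 0), nat_w)
    y = min(max(y, 0), nat_h)
    return round(x), round(y)
```

For a 1920×1080 monitor shown in an 800×600 element, the image occupies 800×450 with 75-pixel bars above and below, so the element's center maps to the monitor's center.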
Tips
- After web edits, the server writes `config.json`, creates time‑stamped backups in the `backup configs/` folder (e.g., `config.backup.176257XXXX.json`), prunes older ones according to `backup_retention` in `server_config.json`, and triggers an IPC reload so the desktop GUI reflects changes without restart.
- If you prefer GUI editing, you can mix and match; web and desktop stay in sync.
- Failsafe template name
  - Web saves `failsafe.template_name` and the server also writes `failsafe.template` for backward compatibility.
  - Desktop reads either key and preserves selection when repopulating the combo; preview updates immediately.
- Sequences and Failsafe actions
  - Random movement: set `random=true` and a `random_region` in the action; desktop honors random move_to and random region clicks.
  - Action editors only expose fields relevant to the chosen type; saves mirror exactly what the desktop expects.
- Group Call steps
  - Web Sequences and Failsafe editors both support Group Call steps; Sequences has a top‑level “Group” dropdown for quick adds.
  - Editors load groups before rendering and ensure step selections are visible even if the group name wasn’t in the list yet.
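The backup-and-prune behavior from the first tip could look roughly like this. It is a sketch under assumptions: the file naming follows the `config.backup.<timestamp>.json` example above, while the function names and the optional `stamp` parameter (useful for testing) are invented for illustration.

```python
import shutil
import time
from pathlib import Path


def _stamp(path):
    """Return the numeric timestamp embedded in a backup file name, or -1."""
    parts = path.name.split(".")
    return int(parts[2]) if len(parts) == 4 and parts[2].isdigit() else -1


def backup_and_prune(config_path, backup_dir, retention=20, stamp=None):
    """Copy the config into backup_dir, then keep only the newest `retention` backups."""
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = int(time.time()) if stamp is None else int(stamp)
    dest = backup_dir / f"config.backup.{stamp}.json"
    shutil.copyfile(config_path, dest)
    # Newest first, ordered by the timestamp in the file name.
    backups = sorted(backup_dir.glob("config.backup.*.json"), key=_stamp, reverse=True)
    for old in backups[retention:]:
        old.unlink()
    return dest
```

Ordering by the timestamp in the name rather than by file modification time keeps pruning stable even if backups are copied or restored later.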
The app reads/writes config.json in the script directory. Template paths are converted to relative paths when possible.
Top-level structure:
{
"templates": {
"TemplateName": "images/Template.png"
},
"sequences": [
{
"name": "Example",
"steps": [
{
"find": "TemplateName",
"required": true,
"confidence": 0.8,
"timeout": 10,
"search_region": [x, y, width, height],
"actions": [
{"type": "move_to", "duration": 0.5},
{"type": "click"},
{"type": "type", "text": "hello"},
{"type": "key_press", "key": "enter"},
{"type": "wait", "seconds": 0.5},
{"type": "scroll", "pixels": -300},
{"type": "click_and_hold", "duration": 1.0}
]
}
],
"loop": false,
"loop_count": 1
}
],
"failsafe": {
"enabled": true,
"template": "TemplateName",
"confidence": 0.8,
"region": [x, y, width, height],
"sequence": [
{
"find": "AnotherTemplate",
"required": true,
"timeout": 10,
"actions": [{"type": "click"}]
}
]
},
"break_settings": {
"enabled": true,
"max_runtime_seconds": 3600
}
}

Action dictionary fields (as used across sequences and failsafe steps):
- Common: `type`
- Mouse actions:
  - `button` (left/right/middle), `clicks` (int)
  - `x`, `y` (for absolute `move`), `duration`
  - `region` (for click randomization), `random`, `random_region` (for random move-to)
- Keyboard: `text` (type), `key` (key_press)
- Timing/scroll: `seconds` (wait), `pixels` (scroll)
- Template matching uses OpenCV (`cv2.matchTemplate`). Confidence threshold is adjustable per step.
- Feature matching supports multiple detectors and safe fallbacks:
  - Detectors: `ORB` (fast), `AKAZE` (scale-robust), `SIFT` (strong features; included in `opencv-python` 4.4 and later, older versions need `opencv-contrib-python`).
  - Pipeline: KNN + Lowe’s ratio → RANSAC homography → sanity check (area ratio) → center of detected polygon.
  - Fallbacks: If the selected detector fails, the bot automatically tries `AKAZE`, then `SIFT`. If all fail, a multi‑scale template match runs.
  - Provide `strategy: "feature"` and optional `min_inliers`, `ratio_thresh`, `ransac_thresh` per step.
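The “KNN + Lowe’s ratio” stage of the pipeline keeps a match only when its best candidate is clearly better than the second best. The sketch below shows that filter on plain distance pairs so it runs without OpenCV; the function name and the `0.75` default are illustrative assumptions, not values taken from the project.

```python
def lowe_ratio_filter(knn_distances, ratio_thresh=0.75):
    """Return indexes of descriptors that pass Lowe's ratio test.

    knn_distances holds one (best_distance, second_best_distance) pair per
    query descriptor, as produced by a k=2 nearest-neighbour match.
    """
    good = []
    for index, (best, second) in enumerate(knn_distances):
        # Keep the match only if the best candidate is distinctly closer.
        if best < ratio_thresh * second:
            good.append(index)
    return good


# Only the first pair is unambiguous enough to survive.
survivors = lowe_ratio_filter([(10, 100), (50, 55), (30, 40)])
```

The surviving matches are what a later RANSAC step would consume; a step-level `min_inliers` would then gate on how many of them agree with the estimated homography.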
- If a step has no `find` template, its actions run directly.
- `move_to` can target the detected position or a random point inside a selected region.
- `click` defaults to the current mouse location unless `force_move` is used internally for region clicks.
- `click_and_hold` acts at the bot’s current position, usually set by a prior `move`/`move_to`.
- click: `button`, `clicks`; optional `random` + `random_region` for a random click inside a region
- move: absolute `x`, `y`, `duration`; falls back to the detected position in a step when started via web (MOVE without `x`, `y` is treated like MOVE_TO)
- move_to: detected template position or explicit `x`, `y` + `duration`; optional `random` + `random_region`
- type: `text`
- key_press: `key`, optional `modifiers`
- wait: `seconds`
- scroll: `pixels`
- click_and_hold: `duration` at current position
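Picking a random target inside a `random_region` can be sketched as below. The `[x, y, width, height]` layout follows the config example in this README; the function name is hypothetical and the actual engine may choose points differently.

```python
import random


def random_point_in_region(region, rng=random):
    """Return a uniformly random (x, y) pixel inside region = [x, y, width, height]."""
    x, y, width, height = region
    if width <= 0 or height <= 0:
        raise ValueError("region width and height must be positive")
    # randint is inclusive on both ends, so subtract 1 to stay inside the box.
    return rng.randint(x, x + width - 1), rng.randint(y, y + height - 1)
```

Passing a seeded `random.Random` instance as `rng` makes the choice reproducible when debugging a sequence.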
- Toolbar Monitor Selector:
  - Choose All Monitors or a specific screen; status bar shows the active region.
  - Capture dialogs (Templates tab) respect the selection.
- Per-Step Monitor Override:
  - In the Step Editor, set the step's Monitor; this constrains detection to that screen when no `search_region` is set.
  - `monitor` persists in `config.json` as `null`, `"ALL"`, or `[x, y, w, h]`.
- Region Selection:
  - Select Search Region opens an overlay; if a monitor is chosen, the overlay is restricted to that screen.
  - Regions and monitors can be combined; region takes precedence over monitor.
- Runtime Capture:
  - Uses per-monitor `QScreen.grabWindow(...)` with `devicePixelRatio()` for DPI-aware capture.
  - For full desktop, frames are stitched from each monitor according to virtual desktop coordinates.
  - Fallbacks: `PIL.ImageGrab.grab(bbox=...)` or `pyautogui.screenshot(region=...)` when needed.
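The precedence between `search_region`, the per-step `monitor`, and the toolbar selection can be summarized in a few lines. This is a sketch: the function name is hypothetical, and treating a `null` step monitor as “use the toolbar selection” is an assumption.

```python
def resolve_capture_area(step, global_monitor=None):
    """Return the (x, y, w, h) area to search, or None for the full desktop.

    Precedence: step search_region > step monitor > global (toolbar) monitor.
    """
    region = step.get("search_region")
    if region:
        return tuple(region)  # an explicit region always wins
    monitor = step.get("monitor")
    if monitor is None:
        monitor = global_monitor  # assumption: null means "use the global selection"
    if monitor is None or monitor == "ALL":
        return None  # no restriction: search the whole virtual desktop
    return tuple(monitor)
```

Returning `None` for the unrestricted case lets the caller decide how to capture and stitch the full desktop.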
- The worker periodically checks the configured failsafe template (`check_failsafe_trigger`).
- When detected, the failsafe sequence executes (`execute_failsafe_sequence`).
execute_failsafe_sequence). - The Failsafe tab’s Test button runs only the failsafe sequence, using the current GUI configuration.
- Use the Break Settings tab to set a maximum runtime using `hours`, `minutes`, and `seconds`.
- When enabled, the cap applies to total runtime and overrides any sequence loop count.
- The status bar shows a live clock: `Elapsed: Hh Mm Ss / Max: Hh Mm Ss`.
- The bot stops automatically when elapsed ≥ max runtime.
- These values persist to `config.json`:
  - `break_settings.enabled`: boolean
  - `break_settings.max_runtime_seconds`: integer seconds
- Optional final sequence on break:
  - Toggle: `Run final sequence when time is hit` and choose the sequence from the dropdown.
  - Behavior: when max runtime is reached, the primary run ends and the selected final sequence is launched once. The final sequence ignores the runtime cap and runs to completion.
  - Live refresh: the dropdown updates immediately when you add, delete, rename, or duplicate sequences; no restart is required.
  - Config fields:
    - `break_settings.run_final_after_break`: boolean
    - `break_settings.final_sequence_name`: string (sequence name) or `"(none)"`
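The runtime cap arithmetic is simple enough to sketch. The helper names are hypothetical; the `enabled` and `max_runtime_seconds` keys come from the `break_settings` fields listed above, and a cap of `0` is treated as “no enforcement”.

```python
def to_seconds(hours=0, minutes=0, seconds=0):
    """Convert the three Break Settings fields into max_runtime_seconds."""
    return hours * 3600 + minutes * 60 + seconds


def format_hms(total_seconds):
    """Format a duration the way the status bar clock is described: 'Hh Mm Ss'."""
    hours, remainder = divmod(int(total_seconds), 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}h {minutes}m {seconds}s"


def cap_reached(elapsed_seconds, break_settings):
    """True when the cap is enabled, non-zero, and the elapsed time has reached it."""
    cap = break_settings.get("max_runtime_seconds", 0)
    return bool(break_settings.get("enabled")) and cap > 0 and elapsed_seconds >= cap
```

A worker loop could call `cap_reached(...)` between steps and, when it returns true, stop the run and start the configured final sequence.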
- Use the Scheduled Sequences tab to start specific sequences automatically at a given time each day.
- Each schedule row has:
  - `Enabled`: whether the schedule is active
  - `Sequence`: the sequence to run
  - `Time`: daily start time (`hh:mm AM/PM`; legacy 24-hour `HH:mm` values are also accepted)
- Behavior toggles:
  - `Queue if busy`: if a run is already in progress at the scheduled time, starts the scheduled sequence as soon as the current run completes.
  - `Preempt if busy`: stops the current run immediately and starts the scheduled sequence.
  - `Resume previous`: when preempting, resumes the original run after the scheduled sequence finishes.
  - Queue and Preempt are mutually exclusive when saved; if Preempt is enabled, Queue is saved as off.
- Behavior:
  - At the scheduled time, the app starts the selected sequence once per day.
  - If another run is already in progress and neither `Queue if busy` nor `Preempt if busy` is on, the scheduled run is skipped.
  - Scheduled runs ignore the max runtime cap (they run to completion unless stopped).
  - The scheduler checks every 30 seconds.
- Persistence:
  - `scheduled_sequences`: array of schedule objects persisted to `config.json`, each with:
    - `enabled`: boolean
    - `sequence_name`: string
    - `time`: `hh:mm AM/PM` (12-hour); loader accepts legacy `HH:mm`.
    - `queue_if_busy`: boolean
    - `preempt_if_busy`: boolean
    - `resume_previous`: boolean
    - `last_run_date`: `YYYY-MM-DD` (used to ensure only one run per day)
- Tips:
  - Make sure your sequence exists and is valid before scheduling.
  - You can add/remove schedule rows anytime; changes are saved with the configuration.
  - New schedule rows default to `Enabled = off`; toggle on to activate.
  - The status bar shows `Next: <Sequence> @ hh:mm AM/PM` and updates immediately when you edit schedule rows.
- The top toolbar includes quick actions for configuration:
  - `New`: create a fresh configuration
  - `Open`: load an existing `config.json`
  - `Save`: write current configuration
  - `Save As`: save configuration to a new file
  - These mirror the File menu and make switching configs faster.
- The scheduler runs only while the app is open.
- Time parsing supports both `hh:mm AM/PM` and `HH:mm`.
- When a run is active at the scheduled time:
  - With `Queue if busy` on, the scheduled run starts right after the active run completes.
  - With `Preempt if busy` on, the current run stops and the scheduled run starts immediately; if `Resume previous` is on, the original run resumes automatically after the scheduled run completes.
  - With neither on, the scheduled run is skipped for that minute.
- Tips:
  - Verify with a small cap (e.g., `0h 0m 10s`).
  - Ensure `Enable Max Runtime` is checked; a cap of `0` disables enforcement.
  - Check `bot_debug.log` for entries like `Max runtime reached (...)`.
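Accepting both time formats and choosing what to do when a run is already active can be sketched as follows. The function names are hypothetical; the `queue_if_busy` and `preempt_if_busy` keys are the persisted schedule fields. Note that `%p` parsing depends on the process locale and is assumed here to be an English/C locale.

```python
from datetime import datetime


def parse_schedule_time(text):
    """Parse 'hh:mm AM/PM' first, then fall back to legacy 24-hour 'HH:mm'."""
    text = text.strip()
    for fmt in ("%I:%M %p", "%H:%M"):
        try:
            return datetime.strptime(text, fmt).time()
        except ValueError:
            continue
    raise ValueError(f"unsupported schedule time: {text!r}")


def busy_action(schedule):
    """Decide how a due schedule behaves while another run is active."""
    if schedule.get("preempt_if_busy"):
        return "preempt"  # Preempt wins; Queue is saved as off in that case
    if schedule.get("queue_if_busy"):
        return "queue"
    return "skip"
```

Trying the 12-hour format first matters: a value such as `09:30 PM` would fail the 24-hour pattern anyway, while `21:30` fails the 12-hour pattern and falls through cleanly.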
- Runtime logs: `bot_debug.log`
- On failed matches, screenshots and templates may be saved under a `debug/` folder next to the script
- Press F8 to stop sequences; the status bar shows progress and messages
- Action metrics: the engine records per‑action timing results (type, success, elapsed seconds). Entries are summarized in `bot_debug.log` and retained in memory for recent actions.
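Per-action metrics with a bounded in-memory history could be collected along these lines. This is only an illustration: the class name, the `keep` limit, and the entry keys mirror the description above (type, success, elapsed seconds) but are not taken from the engine's code.

```python
import time
from collections import deque


class ActionMetrics:
    """Record timing results for recent actions in a fixed-size buffer."""

    def __init__(self, keep=100):
        # deque(maxlen=...) silently drops the oldest entry once full.
        self.recent = deque(maxlen=keep)

    def record(self, action_type, func, *args, **kwargs):
        """Run func, time it, and store whether it completed without raising."""
        start = time.perf_counter()
        try:
            func(*args, **kwargs)
            success = True
        except Exception:
            success = False
        elapsed = time.perf_counter() - start
        self.recent.append({"type": action_type, "success": success, "elapsed": elapsed})
        return success
```

Each stored entry can then be formatted into a log line, while the deque keeps memory use constant during long runs.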
- Strategy Dropdown: Default (template) or Feature (scale/rotation).
- Parameters: Min Inliers, Ratio, RANSAC thresholds.
- Visualizer controls:
  - Detector: choose `ORB`, `AKAZE`, or `SIFT` for the feature preview.
  - Show Keypoints: toggle overlay of scene keypoints for visual debugging (off by default for performance).
- Capture Backend:
  - Options: `Auto (best)`, `MSS`, `QScreen`.
  - Auto prefers `MSS` when available, otherwise uses `QScreen`.
  - An availability indicator shows whether the selected backend is usable.
- Debug Panel shows:
  - Target screen index, geometry, local capture rect
  - Frame size and bytes-per-line, DPI ratio, pixmap state
  - Backend used (MSS/QScreen) and fallback path (PIL bbox, pyautogui region, stitched full desktop)
  - Metrics: Inliers, Matches, Confidence, RANSAC reprojection error.
- Toolbar button opens a dialog listing monitors: index, geometry, and DPI ratio.
- Capture All stitches a preview from all monitors to validate layout.
- Use small, high-contrast templates of the exact UI you want to detect
- Avoid scale/rotation changes; match works best for identical sizes
- Consider using `search_region` to narrow detection for speed and accuracy
- For multi-monitor setups, prefer per-step monitor selection or regions on the target screen.
- Use Feature strategy for rotated/scaled UI elements; increase `min_inliers` or adjust `ratio_thresh` when noisy.
- For random click/move regions, ensure coordinates are on-screen and correct
- Template matching is sensitive to scaling and rotation
- Feature matching adds overhead; tune thresholds for your scene.
- Some capture backends may behave differently under extreme DPI or exotic layouts; robust fallbacks are implemented.
- Some GUI interactions are evolving; if you hit issues (e.g., editing failsafe steps), check logs and report
- Not all advanced scenarios are fully implemented; contributions are welcome
- Issues and PRs are appreciated
- Keep changes focused and documented
- Please avoid using this tool for anything that breaks app/game ToS
This project is for personal use; choose an appropriate license before public release if needed
- IPC and compiled mode
  - Web server writes `ipc_command.json` to the executable folder when compiled; GUI reads and deletes it after handling.
  - Web server serves static pages from `web/static`; compiled builds also fall back to `_internal/web/static`.
  - Launcher’s “Open Web Portal” targets `http://localhost:<port>/?token=...` to avoid `0.0.0.0` in browsers.
- Web-start runs
  - The GUI applies `break_settings.enabled` and `break_settings.max_runtime_seconds` from `config.json` when sequences are started via web IPC.
  - Status bar shows `Max runtime: Hh Mm Ss` and live elapsed.
  - On cap hit, the run stops and the configured final sequence runs once (if set).
- Web portal 404 on root (`/?token=...`) in compiled build:
  - Launch `WebServer.exe` from `dist/ProjectBundle`; the compiled server serves from `web/static` and `_internal/web/static`.
- Launcher restarts itself instead of starting GUI/server:
  - Use `Launcher.exe` from `dist/ProjectBundle` (same folder as `ImageDetectionBot.exe` and `WebServer.exe`).
- Web-start doesn’t show current step in GUI:
  - GUI switches to Sequences, selects the active sequence, and brings window to foreground.
- “MOVE action requires x and y coordinates” after web-start:
  - MOVE without `x`, `y` uses the detected position in the step (treated like MOVE_TO) to keep actions running.
- Random movement not saved in web failsafe:
  - Web editors save `random` and `random_region`; compiled GUI honors random move_to and random-region clicks.
- MJPEG preview shows `net::ERR_ABORTED` occasionally:
  - That’s a reconnect artifact; it does not affect saving, IPC, or UI updates.
- Templates list appears but bot fails to load a template:
  - The UI is designed to stay populated even if individual templates fail to load into the bot.
  - Check `bot_debug.log` for the failing template path; ensure the image exists at the resolved absolute path.
  - Use the Templates tab to update the path or re-capture and save the template.
- Recorder tab:
  - Start Recording to capture global mouse moves, clicks, scrolls, and key presses (requires `pynput`).
  - Stop to end capture, then optionally delete selected events.
  - Save… prompts for a friendly name; stores `recordings/<name>.json` with `events` and `name`.
  - Preview Overlay replays visually without performing any input.
  - Playback executes the recording immediately.
  - Use Play Recording in any step editor to run a saved recording; pick from the dropdown that lists files in `recordings/`.
  - Pause/Resume: temporarily pause capture during a session; resuming preserves timing and continues appending events.
  - Add Marker: insert labeled markers into the timeline to annotate moments.
  - Library Bar: browse and manage saved recordings (Refresh list, Open into the table, Play immediately, Rename, Delete).
  - Play Recording action fields:
    - Recording: select from the `recordings/` folder via dropdown.
    - Speed: multiplier to compress or expand event intervals (e.g., `2.0` runs twice as fast).
    - Start Transition (s): optional smooth pre‑roll move to the first recorded position (set to `0.0` for auto distance‑based duration).
  - Implementation details:
    - Speed scales per‑event intervals; internal `pyautogui.PAUSE` is disabled during playback to avoid unintended delays.
    - Start Transition uses either the configured duration or an auto duration based on cursor distance and speed.
    - Playback and action results are logged to `bot_debug.log` for traceability.
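How the Speed multiplier compresses or stretches a recording can be sketched as below. The event layout is an assumption: each event is taken to carry an absolute timestamp in seconds under a hypothetical `t` key, which may differ from the actual `recordings/<name>.json` schema.

```python
def scaled_delays(events, speed=1.0):
    """Return the delay to wait before each event after applying the speed multiplier.

    events: list of dicts with a timestamp in seconds under "t" (hypothetical key).
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    delays = []
    previous = None
    for event in events:
        timestamp = event["t"]
        if previous is None:
            delays.append(0.0)  # the first event plays immediately
        else:
            # Dividing the recorded gap by speed makes 2.0 play twice as fast.
            delays.append(max(0.0, (timestamp - previous) / speed))
        previous = timestamp
    return delays


# Gaps of 1 s and 2 s become 0.5 s and 1 s at double speed.
delays = scaled_delays([{"t": 0.0}, {"t": 1.0}, {"t": 3.0}], speed=2.0)
```

A playback loop would sleep for each delay before dispatching the corresponding event.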