plugin
The community/plugin directory serves as the primary hub for extending Talon's core execution model with custom features, rich user interfaces, and specialized input behaviors. It houses modular plugins that provide cross-platform desktop management, alternative hardware inputs (like game controllers and eye-trackers), advanced text selection algorithms, and interactive on-screen overlays.
By separating logic into independent plugins, this architecture ensures developers can customize or troubleshoot specific modules—such as on-screen subtitles or voice macros—without interfering with the rest of their configuration.
Core Optimization & Configuration Files
The root of this directory contains general optimization and configuration files that globally adjust input behaviors:
- paste_to_insert.py – Optimizes text entry speed and reliability. It intercepts Talon's default `insert` actions. If the text's character count exceeds the limit defined in `user.paste_to_insert_threshold`, it issues a paste command instead of typing the characters key by key.
- eye_tracking_settings.py – Serves as an environment configuration file for calibrating zoom-mouse scales and window boundaries across macOS and Windows operating systems.
- README.md – A high-level directory map listing user-facing behaviors for each nested plugin.
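The threshold check described for paste_to_insert.py can be sketched in plain Python. This is a minimal illustration, not the plugin's code: `insert_text`, `type_keys`, and `paste` are hypothetical names standing in for Talon's actions, and treating a non-positive threshold as "never paste" is an assumption.

```python
from typing import Callable


def insert_text(
    text: str,
    threshold: int,
    type_keys: Callable[[str], None],
    paste: Callable[[str], None],
) -> str:
    """Send text through paste() when it is longer than threshold, else type it.

    Assumption: a threshold of zero or less disables the paste path entirely.
    Returns the name of the route taken so callers can log or test it.
    """
    if threshold > 0 and len(text) > threshold:
        paste(text)
        return "paste"
    type_keys(text)
    return "type"


if __name__ == "__main__":
    log = []
    route = insert_text(
        "a fairly long sentence",
        threshold=10,
        type_keys=lambda t: log.append(("type", t)),
        paste=lambda t: log.append(("paste", t)),
    )
    print(route, log)
```

Passing the two output routes in as callables keeps the decision logic testable without a running Talon instance.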
Visual Overlays & UI Feedback
Talon relies heavily on transparent on-screen canvases and interactive graphical interfaces to guide users, log history, and verify state changes.
- subtitles – Replaces default subtitles with a fully styled, customizable subtitle renderer. It intercepts recognized words, scales text bounding boxes iteratively to prevent off-screen overflows, and draws outline-wrapped text using the SkiaCanvas API.
- mode_indicator – Spawns a floating, transparent color bubble on-screen that represents the current active Talon state (e.g., Command mode, Dictation mode, or Sleep mode) and shows which hardware microphone is currently active.
- command_history – Tracks recently executed actions in a rolling history buffer and displays them in a compact, collapsible IMGUI (Immediate Mode GUI) list.
- are_you_sure – Provides a safety mechanism for destructive voice commands (like shutting down the computer or closing applications). It halts standard execution, forces a warning popup on screen, and registers a temporary `user.are_you_sure` context tag to restrict active voice commands strictly to confirmation options.
- breaking_changes_notice – Monitors changes to the repository's underlying schema. It watches local size states and displays an on-screen warning if updates require manually adapting user custom commands.
- new_user_message – Detects a first-time installation and automatically opens a friendly interactive onboarding wizard to teach fundamental speech principles.
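The iterative scaling mentioned for subtitles can be illustrated with a small loop that shrinks the font size until the text fits. This is a sketch under stated assumptions: `fit_font_size` is a hypothetical helper, and the width estimate (character count times size times a fixed ratio) stands in for measuring the text on the real canvas.

```python
def fit_font_size(
    text: str,
    max_width: float,
    start_size: float = 100.0,
    min_size: float = 10.0,
    step: float = 0.9,
    char_width_ratio: float = 0.6,
) -> float:
    """Shrink the font size step by step until the estimated width fits.

    Assumption: width is approximated as len(text) * size * char_width_ratio.
    The loop stops at min_size even if the text still overflows.
    """
    size = start_size
    while size > min_size and len(text) * size * char_width_ratio > max_width:
        size = max(min_size, size * step)
    return size


if __name__ == "__main__":
    for phrase in ("hi", "a much longer recognized phrase to display"):
        print(phrase, "->", round(fit_font_size(phrase, max_width=800), 2))
```

A multiplicative step converges in a few iterations for long phrases while leaving short ones at full size.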
Alternative Input & Accessibility Methods
This group of plugins bridges the gap between hardware controls, mouth noises, and eye-tracking systems to enable seamless hands-free navigation.
- mouse – A mouse management system that maps voice triggers to clicks, drags, and scroll loops. It includes continuous scroll systems, "hiss" noise activation, and "gaze scrolling" (translating eye movements into relative scrolling speeds).
- gamepad – Adapts physical video game controllers as pointing devices. It uses a non-linear acceleration algorithm to scale thumbstick inputs into mouse coordinates and features a canvas-based gamepad_tester to inspect and calibrate controller bindings.
- repeater – Rapidly repeats the previous command using spoken ordinals ("twice", "third") or acoustic triggers, such as popping your lips twice in quick succession.
- cancel – Intercepts phrases before execution to discard them if the phrase ends with a keyword (e.g., "cancel cancel"). It also prevents stale, buffered audio from executing immediately after waking Talon up.
- listening_timeout – Uses an asynchronous background cron timer to track user inactivity, automatically disabling microphone capture after a customizable threshold to save CPU and ensure privacy.
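As an illustration of the non-linear thumbstick scaling mentioned for gamepad, the sketch below applies a deadzone and a power curve to the stick deflection. The function name, the deadzone, the exponent, and the maximum speed are all assumptions chosen for the example, not values taken from the plugin.

```python
import math
from typing import Tuple


def stick_to_velocity(
    x: float,
    y: float,
    deadzone: float = 0.1,
    exponent: float = 2.0,
    max_speed: float = 20.0,
) -> Tuple[float, float]:
    """Map a thumbstick position in [-1, 1] x [-1, 1] to a cursor velocity.

    Small deflections inside the deadzone produce no motion. Beyond it, speed
    grows with deflection raised to `exponent`, so gentle pushes give fine
    control and full pushes reach max_speed (pixels per tick, hypothetical).
    """
    norm = math.hypot(x, y)
    magnitude = min(1.0, norm)
    if magnitude <= deadzone:
        return (0.0, 0.0)
    # Rescale so speed starts from zero at the edge of the deadzone.
    scaled = (magnitude - deadzone) / (1.0 - deadzone)
    speed = max_speed * scaled ** exponent
    # Keep the stick's direction, apply the curved speed.
    return (speed * x / norm, speed * y / norm)


if __name__ == "__main__":
    for deflection in (0.05, 0.25, 0.5, 1.0):
        print(deflection, stick_to_velocity(deflection, 0.0))
```

With an exponent above 1, half deflection yields well under half of the maximum speed, which is the property that makes precise pointing possible on a small stick.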
Advanced Text Editing & Composition
Because input fields in web forms and native applications often lack support for precise dictation, these plugins establish specialized text workspaces.
- draft_editor – Bridges Talon with an external full-featured text editor (like Visual Studio Code). It copies text from any obscure form field, spins up a temporary editor tab, tracks editing progress, and automatically pastes the updated draft back into the source window upon submission.
- talon_draft_window – Implements an internal floating text editing canvas. It maps a single-letter key ("anchor") to every visible word, allowing developers to execute operations like "replace delta with beta" or "clear charlie through foxtrot".
- text_navigation – Parses the text surrounding the cursor using regular expressions. Instead of sending raw cursor movements, it scans the editor buffer to let the user select, delete, cut, or hop relative to characters, camelCase fragments, or parentheses.
- symbols – Handles multi-character sequences, wrapping selections within brackets or quotation marks, and manages deprecation warnings when legacy symbol terms are spoken.
- datetimeinsert – Computes and inserts high-resolution or standard ISO timestamps in local and Coordinated Universal Time (UTC) formats.
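The regex-based targeting described for text_navigation can be sketched as a search for the n-th match to the right of the cursor. The pattern `CAMEL_FRAGMENT` and the helper `find_nth_right` are hypothetical names for this example. The real plugin works on text it reads from the editor, whereas this sketch takes the text and cursor offset as plain arguments.

```python
import re
from typing import Optional, Pattern, Tuple

# Hypothetical pattern: one camelCase fragment, an acronym run, or a digit run.
CAMEL_FRAGMENT = re.compile(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+")
PAREN = re.compile(r"[()]")


def find_nth_right(
    text: str, cursor: int, pattern: Pattern[str], n: int = 1
) -> Optional[Tuple[int, int]]:
    """Return the (start, end) span of the n-th match at or after cursor.

    Returns None when fewer than n matches exist to the right of the cursor.
    """
    for index, match in enumerate(pattern.finditer(text, cursor), start=1):
        if index == n:
            return match.span()
    return None


if __name__ == "__main__":
    line = "getUserName(value)"
    span = find_nth_right(line, 0, CAMEL_FRAGMENT, n=2)
    if span is not None:
        print(span, line[span[0]:span[1]])
```

Returning a span rather than moving the cursor leaves it to the caller to decide whether to select, delete, or jump to the target.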
Hardware and System Integration
These modules bridge the core Python layer with OS-level services and APIs.
- desktops – A unified cross-platform interface for switching virtual workspaces and dragging active windows between spaces. It maps OS-level commands natively on Linux (`ui`), Windows (shortcuts), and macOS (Mission Control accessibility parameters).
- screenshot – Manages screen captures. It resolves OS-level differences (such as triggering Windows Snipping Tool or macOS capture shortcuts) to save structured, timestamped PNGs to your Pictures directory or clipboard.
- microphone_selection – Interfaces with cross-platform audio context libraries (`cubeb`) to poll, filter, and switch active system microphones on the fly using voice commands or an IMGUI dropdown.
- media – Tracks platform differences to send standard playback, track skip, and volume adjustment signals to the host system.
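The timestamped file naming mentioned for screenshot might look like the following sketch. The helper name, the `screenshot` prefix, and the exact timestamp layout are assumptions for illustration. Only the idea of a timestamped PNG in a target directory comes from the description above.

```python
from datetime import datetime
from pathlib import Path


def screenshot_path(
    directory: Path, taken_at: datetime, prefix: str = "screenshot"
) -> Path:
    """Build a PNG path whose name embeds the capture time.

    Assumption: a year-first, zero-padded layout so files sort by time.
    """
    stamp = taken_at.strftime("%Y-%m-%d_%H-%M-%S")
    return directory / f"{prefix}_{stamp}.png"


if __name__ == "__main__":
    print(screenshot_path(Path.home() / "Pictures", datetime.now()))
```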
Developer and Automation Utilities
The final group helps developers construct and debug their speech workspace configurations.
- macro – Records consecutive spoken phrases, replays sequences dynamically through Talon's mimicry API, lists saved templates on screen, and formats macro events into copy-pasteable Talon code.
- talon_helpers – Contains developer tools to dump active registry scopes and copy localized bundle IDs to the clipboard. It also features create_app_context.py, which automatically scaffolds boilerplate Python and Talon files for unrecognized applications.
- then – Implements an explicit semantic separator, `then`, which executes a safe `skip()` action. This helps Talon's parser isolate sequential command phrases and resolve grammar ambiguities when chaining actions.
- dropdown – Catches older, legacy selection prompts and issues deprecation logs instructing developers on updated terminology.
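The record, replay, and export cycle described for macro can be modeled with a small class. This is a sketch, not the plugin's implementation: `MacroRecorder` and its methods are hypothetical, the `mimic` callable stands in for Talon's mimicry API, and the exported `mimic("...")` lines are an assumed output format.

```python
from typing import Callable, List


class MacroRecorder:
    """Collect spoken phrases while recording, then replay or export them."""

    def __init__(self) -> None:
        self.recording = False
        self.phrases: List[str] = []

    def start(self) -> None:
        # Starting a new recording discards the previous macro.
        self.phrases = []
        self.recording = True

    def observe(self, phrase: str) -> None:
        # Called for every recognized phrase; ignored unless recording.
        if self.recording:
            self.phrases.append(phrase)

    def stop(self) -> None:
        self.recording = False

    def replay(self, mimic: Callable[[str], None]) -> None:
        # Feed each phrase back through the supplied mimic function, in order.
        for phrase in self.phrases:
            mimic(phrase)

    def as_talon_script(self) -> str:
        # Assumed export format: one mimic("...") line per recorded phrase.
        return "\n".join(f'mimic("{phrase}")' for phrase in self.phrases)


if __name__ == "__main__":
    recorder = MacroRecorder()
    recorder.start()
    recorder.observe("select line")
    recorder.observe("copy that")
    recorder.stop()
    recorder.replay(print)
    print(recorder.as_talon_script())
```

The export step does not escape quotation marks inside phrases, so real use would need to handle them.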