core
The community/core directory is the foundation of the Talon voice control system's community repository. It defines the essential APIs, base layers, state tracking, and engine abstractions that enable keyboard and mouse emulation, natural dictation, platform abstraction, multi-monitor window coordination, and interactive on-screen user interfaces.
For a high-level index of the commands contained in each subdirectory, refer to the README.md.
Configuration and Utility Foundations
The top-level files in this directory establish the plumbing that manages file-system observers, coordinates active application environments, dynamically registers spoken-form vocabulary, and schedules engine actions.
- User Customization and CSV Tracking: user_settings.py provides custom decorators such as @track_csv_list and @track_file to monitor configuration files in the settings/ or private/ directories. This allows users to add vocabularies, rules, or file aliases that reload dynamically in Talon without requiring a restart.
- Dynamic Phonetic and Spoken-Form Generation: create_spoken_forms.py implements the algorithms that translate arbitrary input strings into phonetically pronounceable sequences. It handles acronym expansions, splits camelCase/PascalCase boundaries, converts numeric values into spoken years (e.g., "2025" to "twenty twenty five"), and maps complex symbols to phonetic defaults using the symbol maps defined in keys/README.md.
- Application Environment Tracking: app_running.py registers system-level lifecycle callbacks (app_launch and app_close) to populate Talon’s active registry with running background applications. application_matches.py contains platform-specific bundle and process patterns (e.g. for IDEs, terminal wrappers, and databases), while tags.py registers global community capability tags.
- System Navigation: system_paths.py auto-generates host-specific .talon-list files on startup, allowing direct reference to local user directories (like Desktop, Documents, or Talon recordings).
- Deprecation Auditing: deprecations.py allows developers to gracefully phase out voice commands, actions, or captures by notifying users on-screen at regulated intervals.
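The spoken-form generation described above can be illustrated with a minimal sketch. It is not the code in create_spoken_forms.py: the function names, the regular expression, and the rule that a four-digit number is read as two pairs only when its last two digits are at least 10 are all assumptions made for illustration.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]

# Hypothetical tokenizer: acronym runs, capitalized or lowercase words, digit runs.
# Characters that match none of these (symbols, spaces) are simply skipped.
TOKEN = re.compile(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|\d+")


def split_identifier(text: str) -> list:
    """Split camelCase/PascalCase text into its word, acronym, and digit tokens."""
    return TOKEN.findall(text)


def two_digit_words(n: int) -> str:
    """Spell out an integer in the range 0..99."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]} {ONES[ones]}"


def number_to_words(digits: str) -> str:
    """Read four-digit values year-style; otherwise speak digit by digit."""
    if len(digits) == 4 and digits[0] != "0" and int(digits[2:]) >= 10:
        return f"{two_digit_words(int(digits[:2]))} {two_digit_words(int(digits[2:]))}"
    return " ".join(ONES[int(d)] for d in digits)


def create_spoken_form(text: str) -> str:
    """Turn an arbitrary identifier into a pronounceable phrase."""
    words = []
    for token in split_identifier(text):
        if token.isdigit():
            words.append(number_to_words(token))
        elif token.isupper() and len(token) > 1:
            words.append(" ".join(token))  # spell acronyms letter by letter
        else:
            words.append(token.lower())
    return " ".join(words)
```

Under these assumptions, create_spoken_form("parseHTTPResponse2025") yields "parse H T T P response twenty twenty five". The real module also maps symbols to spoken defaults, which this sketch leaves out.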
Low-Level Speech and Event Controls
These files abstract physical microphone inputs and non-verbal noises to control Talon's underlying behavior:
- noise.py handles non-verbal vocalization events (like pops or hisses), utilizing debouncing timers to prevent speech from accidentally triggering scrolling or clicks.
- delayed_speech_off.py implements a lazy-disable mechanism for the microphone, turning off speech recognition only after the active phrase finishes processing.
- dragon_engine.py translates generic sleep, wake, and modal commands into speech-mimic actions specifically for environments using the Dragon Speech engine.
- system_command.py exposes global actions to execute shell utilities synchronously or asynchronously.
- described_functions.py is a developer-focused utility that appends custom docstrings to inline lambdas, facilitating registration within the dynamic help menus.
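The idea behind described_functions.py can be sketched in a few lines of plain Python. The helper name with_description and the help_entries function are hypothetical and chosen for illustration; the only fact relied on is that Python function objects, including lambdas, have a writable __doc__ attribute.

```python
from typing import Callable


def with_description(func: Callable, description: str) -> Callable:
    """Attach a docstring to a callable (hypothetical helper, not the real API)."""
    func.__doc__ = description
    return func


def help_entries(commands: dict) -> list:
    """Render one help line per command, using each callable's docstring."""
    return [f"{name}: {func.__doc__}" for name, func in commands.items()]


# A lambda has no docstring of its own, so a help menu would have nothing to show.
commands = {
    "scroll down": with_description(lambda: None, "Scroll the page down"),
    "scroll up": with_description(lambda: None, "Scroll the page up"),
}
```

Calling help_entries(commands) then returns the lines a help overlay could display, without requiring a named function for each inline action.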
Major Subsystems and Components
Keyboard Input and Text Editing
The core keyboard and text processing layers are decoupled from specific physical layouts and operating systems.
- Keyboard Layouts and Keypress Emulation: keys/README.md establishes platform-neutral lists for standard letters, numbers, and functions. It works with platform adapters like keys/mac/README.md and keys/win/README.md to dynamically map modifiers (e.g., Command on Mac vs. Ctrl on Windows/Linux) to the host environment.
- Advanced Editing and Selection: edit/README.md defines cross-platform text navigation and selection logic. It implements compound editing commands (combining action verbs with target modifiers) and manages specialized boundaries such as paragraph scanning and cursor-centered delimiter insertions.
- Prose Navigation: navigation/README.md provides consistent browser, terminal, and desktop history navigation (forward and backward) using context-aware mapping wrappers.
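As a rough illustration of how a platform-neutral key description might be mapped to the host environment, the sketch below resolves a placeholder modifier to Command on macOS and Ctrl elsewhere. The "primary" token, the PRIMARY_MODIFIER table, and resolve_key are assumptions made for this example and are not taken from the keys modules.

```python
import sys

# Hypothetical table: platforms missing from it fall back to "ctrl".
PRIMARY_MODIFIER = {"darwin": "cmd"}


def resolve_key(spec: str, platform: str = sys.platform) -> str:
    """Replace the placeholder "primary" modifier with the host's modifier."""
    primary = PRIMARY_MODIFIER.get(platform, "ctrl")
    parts = spec.split("-")
    return "-".join(primary if part == "primary" else part for part in parts)
```

With this mapping, resolve_key("primary-c", "darwin") gives "cmd-c", while the same description on Windows or Linux gives "ctrl-c".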
Dictation, Formatting, and Vocabulary
A specialized pipeline translates spoken phrases into formatted, context-sensitive text.
[ Spoken Dictation ]
│
▼
[ text/README.md ] (Handles spacing, capitalization, and history)
│
▼
[ formatters/README.md ] (Applies camelCase, snake_case, etc.)
│
▼
[ vocabulary/README.md ] (Resolves replacements & user exceptions)
- Prose Capture: text/README.md implements state-aware prose capitalization, auto-spacing, sentence boundary detection, and phrase history.
- Case Transforming: formatters/README.md handles programming transformations (like camelCase or CONSTANT_CASE), title casings, and un-formatting routines that parse variables back into raw word streams.
- Custom Lexicons: vocabulary/README.md runs a greedy prefix-tree engine allowing multi-word replacements and on-the-fly additions to personal dictionaries, while abbreviate/README.md leverages CSV mapping overrides to expand or output abbreviated technical text.
- Numbers and Ordinals: numbers/README.md provides a mathematical text parser to translate complex spoken numbers (decimals, signed integers, and scales like "one hundred and five thousand") into digits, alongside pre-computed ordinals (like "twenty-third") used to anchor selections.
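The formatting and un-formatting steps in this pipeline can be sketched as pure string functions. This is a simplified illustration, not the formatters implementation: the function names are assumptions, and the un-formatter only recognizes lowercase-to-uppercase boundaries, underscores, and hyphens.

```python
import re


def camel_case(words: list) -> str:
    """Join words as camelCase: first word lowercase, the rest capitalized."""
    if not words:
        return ""
    first, *rest = words
    return first.lower() + "".join(word.capitalize() for word in rest)


def snake_case(words: list) -> str:
    """Join words as snake_case."""
    return "_".join(word.lower() for word in words)


def constant_case(words: list) -> str:
    """Join words as CONSTANT_CASE."""
    return "_".join(word.upper() for word in words)


def unformat(text: str) -> list:
    """Parse a formatted identifier back into a stream of lowercase words."""
    # Insert a space at each lowercase-or-digit to uppercase boundary.
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", text)
    # Treat underscores and hyphens as word separators.
    spaced = re.sub(r"[_\-]+", " ", spaced)
    return spaced.lower().split()
```

Because unformat returns plain words, a phrase can be re-formatted in another style by chaining the two steps, for example constant_case(unformat("maxRetryCount")).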
Snippets, Structured Elements, and Command Client
These components provide context-sensitive coding blocks and external editor communication.
- Unified Snippet Engine: snippets/README.md utilizes a single .snippet file format with YAML-like context rules. When a snippet is spoken, the engine compiles it dynamically, substituting variables to match active programming language casings.
- Snippet Libraries: The actual multi-language templates reside in snippets/snippets/README.md, organizing structures for loops, class declarations, exception handling, and framework-specific structures.
- External Editor Communication (RPC): command_client/README.md uses a file-based Remote Procedure Call interface to coordinate actions with external applications like VS Code.
- IPC Transport: The low-level transport details are managed in command_client/rpc_client/README.md. It handles temporary directory resolution, exclusive write-locks, backoff polling, and OS-safe unlinking to prevent file lockups.
- Structured Captures: Other modules inject structural text fragments. For instance, contacts/README.md parses structured JSON/CSV records to dictation targets (emails, names, usernames), file_extension/README.md inserts file types, and top_level_domain/README.md handles domain structures (e.g. .com, .org).
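The file-based RPC exchange described above can be sketched as follows. This is a generic illustration of the pattern (exclusive request creation, backoff polling, unlinking after read), not the command client's actual protocol: the file names, the JSON field names, and the trailing-newline completion marker are all assumptions.

```python
import json
import tempfile
import time
from pathlib import Path


def write_request(directory: Path, request: dict) -> Path:
    """Create the request file exclusively; fail if one is already pending."""
    path = directory / "request.json"
    # Mode "x" raises FileExistsError when the file exists, acting as a write lock.
    with open(path, "x", encoding="utf-8") as handle:
        json.dump(request, handle)
    return path


def read_response(directory: Path, timeout: float = 2.0,
                  initial_delay: float = 0.001, max_delay: float = 0.05) -> dict:
    """Poll for the response file with exponential backoff, then remove it."""
    path = directory / "response.json"
    deadline = time.monotonic() + timeout
    delay = initial_delay
    while True:
        try:
            text = path.read_text(encoding="utf-8")
        except FileNotFoundError:
            text = ""
        # Assumption: the writer ends the payload with a newline when finished.
        if text.endswith("\n"):
            path.unlink()
            return json.loads(text)
        if time.monotonic() >= deadline:
            raise TimeoutError("no response received")
        time.sleep(delay)
        delay = min(delay * 2, max_delay)


def demo() -> dict:
    """Simulate one round trip, playing both client and external application."""
    with tempfile.TemporaryDirectory() as name:
        directory = Path(name)
        request_path = write_request(directory, {"commandId": "example", "args": []})
        # Stand-in for the external editor: consume the request, write a response.
        request_path.unlink()
        response = json.dumps({"returnValue": 1}) + "\n"
        (directory / "response.json").write_text(response, encoding="utf-8")
        return read_response(directory)
```

The exclusive create means a second request cannot overwrite one that is still pending, and the capped backoff keeps polling cheap while the external application is busy.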
Environment, UI, and Navigation Overlays
These subdirectories provide interfaces to navigate application windows, select items, and display interactive visual components.
- Window Management and Monitor Snapping: windows_and_tabs/README.md implements cross-platform tab control, handles window validation to discard hidden background processes, and calculates screen boundary coordinates to snap windows to fractional positions or move them across monitors.
- Multitasking Overlays: app_switcher/README.md indexes launchable programs on Windows, Linux, and macOS. It handles custom spoken overrides and manages background application toggles.
- Monitor Display Identification: screens/README.md overlays large numbers on connected monitors using high-contrast, typographically centered canvases.
- Recursive Mouse Grid: mouse_grid/README.md divides the screen into a 3x3 grid, enabling recursive narrowing and zoom-magnification on target regions to control mouse positions hands-free.
- Interactive Help System: help/README.md renders graphical references (lists, active contexts, active commands, and system scopes) to make documentation accessible by voice.
- Dynamic UI Selections: homophones/README.md implements databases and overlays to resolve phonetic homophone overlaps. Similarly, menu_choose/README.md emulates keyboard navigation to interact with system autocomplete dropdowns and context lists.
- State Management: stored_state_management/README.md provides a unified, repository-relative storage path for other modules to save flags, signal files, and persistent text configurations.
- Search and Web Integration: websites_and_search_engines/README.md parses search strings and redirects them to pre-configured query URLs, while providing browser address-bar utilities.
- Voice Configuration Utilities: edit_text_file/README.md allows users to say "customize <file_alias>" to immediately open any registered configuration file in the system's default text editor.
- Mode Orchestration: modes/README.md coordinates transitions between Command, Dictation, Sleep, and Deep Sleep states, and handles automatic language-specific mode overrides based on the active file extension.