modes

This directory contains the core logic for managing and transitioning between different Talon Modes. Talon relies on modes to scope which voice commands are active at any given time. For instance, commands that write code should not be active when dictating an email, and commands that control the computer should be inactive when Talon is asleep.

The files in this directory organize these responsibilities into four primary areas:

  1. Mode Infrastructure & Core Switching
  2. Dictation Mode Implementation
  3. Sleep, Wake, and Engine Integration
  4. Programming Language Detection & Overrides


Mode Infrastructure & Core Switching

The core mechanics of setting up, transitioning between, and initializing modes are handled by Python logic and simple mode-switching .talon files.

  • modes.py defines the central Python API actions for switching modes (command_mode(), dictation_mode(), talon_mode(), and dragon_mode()). It registers the extra "presentation" mode and manages context matching based on whether the sleep mode is active.
  • initial_mode.py reads the initial_mode configuration setting on launch to boot Talon into the user's preferred default state (command, sleep, or dictation).
  • command_and_dictation_mode.talon provides the explicit voice commands ("dictation mode" and "command mode") to transition directly between Talon's two primary awake states.
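The switching behavior described above can be pictured with a small toy model. This is only a sketch in plain Python, not Talon's API: the ModeState class and its methods are hypothetical names invented for illustration, and it assumes that exactly one of command, dictation, or sleep is active at a time.

```python
# Toy model of mode switching. ModeState, switch_to, and commands_active are
# hypothetical names for illustration; they are not part of Talon.
KNOWN_MODES = {"command", "dictation", "sleep"}


class ModeState:
    def __init__(self, initial_mode="command"):
        # Mirrors the idea of an initial_mode setting read once at launch.
        self.active = set()
        self.switch_to(initial_mode)

    def switch_to(self, mode):
        """Make `mode` the only active mode (a simplifying assumption)."""
        if mode not in KNOWN_MODES:
            raise ValueError(f"unknown mode: {mode}")
        self.active = {mode}

    def commands_active(self, required_mode):
        """A command scoped to `required_mode` is live only in that mode."""
        return required_mode in self.active


state = ModeState(initial_mode="dictation")
state.switch_to("command")  # what saying "command mode" would trigger
```

In this model, a command scoped to dictation stops matching as soon as the state switches to command or sleep, which is the scoping idea the files above implement.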

Dictation Mode Implementation

When Talon is in dictation mode, free-form speech is transcribed as raw prose rather than executed as application commands.

  • dictation_mode.talon implements the rules for this state. It routes all captured prose through user.dictation_insert() to maintain appropriate spacing and capitalization context. It also exposes localized commands for inline formatting (e.g., caps, no-caps, spacing overrides), text navigation, word/character selection, and correction commands (such as "scratch that", "select that", or spelling out letters).
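To illustrate what "maintaining spacing and capitalization context" can mean, here is a deliberately simplified sketch. It is not the implementation behind user.dictation_insert(): the insert_prose function is a hypothetical stand-in, and it assumes only two rules (capitalize at the start of a sentence, add a space when the preceding text does not already end in whitespace).

```python
# Simplified, hypothetical stand-in for context-aware prose insertion.
SENTENCE_ENDINGS = (".", "!", "?")


def insert_prose(existing, prose):
    """Append `prose` to `existing`, adjusting spacing and capitalization."""
    # Capitalize at the very start of the text or right after a sentence ends.
    needs_capital = existing == "" or existing.rstrip().endswith(SENTENCE_ENDINGS)
    # Add a separating space unless the text is empty or already ends in whitespace.
    needs_space = existing != "" and not existing.endswith((" ", "\n"))
    if needs_capital and prose:
        prose = prose[0].upper() + prose[1:]
    return existing + (" " if needs_space else "") + prose


text = insert_prose("", "hello world.")
text = insert_prose(text, "this works")
```

A real implementation would also need to handle punctuation that attaches to the previous word, explicit formatting overrides, and reading the surrounding text from the editor, none of which this sketch attempts.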

Sleep, Wake, and Engine Integration

A critical aspect of using Talon is the ability to easily toggle voice recognition off (sleep) and on (wake) to prevent false positives during conversations. This system features specialized configurations tailored to different speech engines and environments:

Core Sleep Behaviors

  • sleep_mode.talon modifies default mouse behaviors when sleeping, disabling pop-clicks, continuous scrolls, and hiss scrolls to prevent accidental physical interactions from background noise.

Engine-Specific Handlers

  • sleep_mode_dragon.talon & dragon_mode.talon: Dragon manages sleep states natively. These files coordinate sleep transitions specifically for users running the Dragon speech engine (Mac/Windows), avoiding conflicts between Dragon commands and Talon commands.
  • sleep_mode_not_dragon.talon & modes_not_dragon.talon: Standardize sleep commands (like "go to sleep" or "sleep all") for non-Dragon backends. They utilize an optional trailing [<phrase>] capture parameter to immediately disable speech processing even when trailing conversation is captured in the same utterance.
  • sleep_mode_wav2letter.talon: Introduces a high-priority catch-all skip command for the wav2letter engine to prevent accidental wakeups from background noise.

Wake-Up Mechanisms

  • sleep_mode_wakeup.talon: Configures fully anchored wake-up commands such as "welcome back". The rules use the + repetition operator so that the command still matches when users repeat the wake phrase without waiting for the speech-recognition engine's pause timeout.
  • sleep_mode_pop_twice_to_wake.py: Provides a non-verbal wake mechanism. When the pop_twice_to_wake tag is active, emitting two popping noises within a specified speed window will instantly wake Talon.
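The double-pop idea can be sketched as a small timing check. This is a plain-Python illustration under stated assumptions, not the code in sleep_mode_pop_twice_to_wake.py: the DoublePopDetector class, the on_pop method, and the 0.3 second default window are all hypothetical.

```python
# Hypothetical sketch of "two pops within a time window" detection.
class DoublePopDetector:
    def __init__(self, window_seconds=0.3):
        # Maximum gap between two pops for them to count as a pair (assumed value).
        self.window_seconds = window_seconds
        self._last_pop = None

    def on_pop(self, now):
        """Record a pop at time `now` (seconds); return True if it completes a pair."""
        last = self._last_pop
        if last is not None and now - last <= self.window_seconds:
            # Reset so that a third pop cannot pair with the second one.
            self._last_pop = None
            return True
        self._last_pop = now
        return False


detector = DoublePopDetector(window_seconds=0.3)
first = detector.on_pop(10.0)   # first pop only arms the detector
second = detector.on_pop(10.2)  # second pop arrives inside the window
```

Passing timestamps in explicitly keeps the sketch easy to test; a real handler would typically read a clock such as time.perf_counter() when the noise event fires.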

Deep Sleep

  • deep_sleep.py & sleep_mode_deep.talon: Provide a highly isolated sleep state intended for meetings or loud environments. Once the user.deep_sleep tag is applied, normal wake commands are disabled, and only the explicit, multi-word phrase "wake up and listen" exits sleep mode.
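The interaction between anchored wake phrases, repetition, and deep sleep can be modeled with regular expressions. This is a rough analogy in plain Python rather than Talon's rule syntax: the should_wake function is hypothetical, only the two phrases quoted above are modeled, and the sketch assumes that "wake up and listen" is checked only while deep sleep is on.

```python
import re

# One or more repetitions of the wake phrase, and nothing else in the utterance.
NORMAL_WAKE = re.compile(r"welcome back(?: welcome back)*")
DEEP_WAKE = re.compile(r"wake up and listen")


def should_wake(utterance, deep_sleep):
    """Return True if the whole utterance is an accepted wake phrase."""
    # Normalize case and collapse runs of whitespace before matching.
    text = " ".join(utterance.lower().split())
    pattern = DEEP_WAKE if deep_sleep else NORMAL_WAKE
    # fullmatch plays the role of a fully anchored rule: no extra words allowed.
    return pattern.fullmatch(text) is not None


repeated = should_wake("welcome back welcome back", deep_sleep=False)
chatter = should_wake("welcome back everyone", deep_sleep=False)
```

Requiring a full match is what keeps ordinary conversation that merely contains the phrase from waking Talon, while the repetition group tolerates an impatient user saying it twice in one utterance.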

Programming Language Detection & Overrides

This subsystem automates or forces contextual rules based on the programming language of the active file buffer.

  • code_languages.py maintains a centralized registry of programming languages, mapping language IDs to their spoken names (e.g., "see sharp" for csharp) and typical file extensions. It also handles special file name matching (e.g., mapping CMakeLists.txt to the cmake language).
  • language_modes.py implements the contextual resolution engine. It automatically tracks the active window's file extension to decide the coding context, but permits manual overrides by setting a global forced_language variable and activating the user.code_language_forced tag.
  • language_modes.talon exposes voice control over this system, letting you manually enforce, clear, or query the programming language scope (e.g., "force python" or "clear language mode").
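The resolution order described above (forced override first, then special file names, then file extension) can be sketched as follows. This is an illustrative plain-Python model, not the code in language_modes.py: the resolve_language function and both lookup tables are hypothetical, and the extension entries are a small assumed subset rather than the real registry.

```python
import os

# Hypothetical, abbreviated lookup tables for illustration only.
EXTENSION_TO_LANGUAGE = {".py": "python", ".cs": "csharp", ".cmake": "cmake"}
SPECIAL_FILENAMES = {"CMakeLists.txt": "cmake"}


def resolve_language(filename, forced_language=""):
    """Return a language ID for `filename`, or "" if none applies."""
    # A manual override wins over anything detected from the file name.
    if forced_language:
        return forced_language
    base = os.path.basename(filename)
    # Whole-name matches cover files whose extension alone is misleading.
    if base in SPECIAL_FILENAMES:
        return SPECIAL_FILENAMES[base]
    extension = os.path.splitext(base)[1].lower()
    return EXTENSION_TO_LANGUAGE.get(extension, "")


detected = resolve_language("project/CMakeLists.txt")
overridden = resolve_language("notes.txt", forced_language="python")
```

Checking the override first is what lets a command like "force python" apply in a buffer whose file name gives no usable hint, and clearing the override simply means passing an empty string again.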