formatters

This directory implements the core formatting and text transformation engine for Talon. It enables users to dictate text and instantly format it into various styles (such as camelCase, snake_case, or Title Case), reformat existing text, and handle spacing or quotes.

The system is split into two parts: a Python processing pipeline and a set of Talon list files that map natural spoken words to specific formatting rules. Because the vocabulary lives in the list files, spoken forms can be changed without touching the Python logic.

Core Architecture

At the center of this directory is formatters.py, which defines the object-oriented structure for text transformations, registers the available formatters, and exposes Talon captures and actions.

Formatter Class Hierarchy

formatters.py defines an abstract base class Formatter and several specialized subclasses:

  • CustomFormatter: Applies an arbitrary Python lambda or function to the text (e.g., surrounding text with quotes or adding a trailing space).
  • CodeFormatter: Handles standard programming formatting. It splits text into words, filters out invalid characters, applies one formatting function to the first word and another to each subsequent word, and joins the results with a defined delimiter (e.g., _ for snake case or an empty string for camel case).
  • TitleFormatter: Implements smart title casing, maintaining lower case for minor words (such as "and", "but", "or", "to") unless they are the first or last word of the phrase.
  • SentenceFormatter: Capitalizes only the first word of a phrase if it is lowercase.
  • CapitalizeFormatter: Capitalizes the first non-whitespace character of a block of text.
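The hierarchy above can be sketched in a few lines of standalone Python. This is a minimal illustration, not the code in formatters.py: the constructor signatures, the regular expression used to drop invalid characters, and the MINOR_WORDS set (limited to the four examples named above) are all assumptions made for the example.

```python
import re
from abc import ABC, abstractmethod
from typing import Callable


class Formatter(ABC):
    """Hypothetical base class: a formatter maps text to text."""

    @abstractmethod
    def format(self, text: str) -> str: ...


class CustomFormatter(Formatter):
    """Wraps an arbitrary function, e.g. one that adds quotes."""

    def __init__(self, fn: Callable[[str], str]):
        self.fn = fn

    def format(self, text: str) -> str:
        return self.fn(text)


class CodeFormatter(Formatter):
    """Formats the first word and the remaining words separately, then joins them."""

    def __init__(self, delimiter: str,
                 format_first: Callable[[str], str],
                 format_rest: Callable[[str], str]):
        self.delimiter = delimiter
        self.format_first = format_first
        self.format_rest = format_rest

    def format(self, text: str) -> str:
        # Assumption: only ASCII letters and digits are valid; anything else
        # acts as a word separator.
        words = re.findall(r"[A-Za-z0-9]+", text)
        formatted = [
            self.format_first(w) if i == 0 else self.format_rest(w)
            for i, w in enumerate(words)
        ]
        return self.delimiter.join(formatted)


# Assumption: only the minor words listed in the text above.
MINOR_WORDS = {"and", "but", "or", "to"}


class TitleFormatter(Formatter):
    """Capitalizes every word except minor words in the middle of the phrase."""

    def format(self, text: str) -> str:
        words = text.split()
        last = len(words) - 1
        out = []
        for i, w in enumerate(words):
            if 0 < i < last and w.lower() in MINOR_WORDS:
                out.append(w.lower())
            else:
                out.append(w[:1].upper() + w[1:])
        return " ".join(out)


snake = CodeFormatter("_", str.lower, str.lower)
camel = CodeFormatter("", str.lower, str.capitalize)
quoted = CustomFormatter(lambda t: f'"{t}"')

print(snake.format("Hello big world"))                 # hello_big_world
print(camel.format("hello big world"))                 # helloBigWorld
print(quoted.format("hi"))                             # "hi"
print(TitleFormatter().format("to be or not to be"))   # To Be or Not to Be
```

Keeping the delimiter and the two per-word functions as constructor arguments means snake case and camel case differ only in configuration, which matches the description of CodeFormatter above.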

The Formatting and Unformatting Pipeline

The core engine supports both formatting (applying a style) and unformatting (reverting text to raw words before applying a new style).

  • format_text_without_adding_to_history: Processes text through a chain of formatters. If multiple formatters are specified (e.g., ALL_CAPS,SNAKE_CASE, which the spoken word constant maps to), they are applied in reverse order, so SNAKE_CASE runs first and ALL_CAPS second. It also preserves wrapping string delimiters (like ', ", or """) so that formatting only affects the text inside the quotes.
  • remove_code_formatting: Used during reformatting to convert formatted identifiers back into a space-separated string of lowercase words. It splits apart delimiters (-, _, ., etc.) and uses de_camel to break up camelCase boundaries.
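The two pipeline steps can be illustrated with a simplified, standalone sketch. The function bodies below are assumptions written for this example (including the helper name split_wrapping_quotes, the two-entry FORMATTERS table, and the exact delimiter set); they mirror the behavior described above rather than reproduce formatters.py.

```python
import re

# Assumption: a tiny formatter table with just enough entries for the demo.
FORMATTERS = {
    "ALL_CAPS": str.upper,
    "SNAKE_CASE": lambda t: "_".join(t.lower().split()),
}

# Triple quotes are checked first so they are not mistaken for single quotes.
QUOTES = ('"""', "'", '"')


def split_wrapping_quotes(text: str) -> tuple[str, str, str]:
    """Return (prefix, inner, suffix), peeling off one pair of wrapping quotes."""
    for q in QUOTES:
        if len(text) >= 2 * len(q) and text.startswith(q) and text.endswith(q):
            return q, text[len(q):-len(q)], q
    return "", text, ""


def format_chain(text: str, formatter_names: str) -> str:
    """Apply a comma-separated formatter chain in reverse order."""
    pre, inner, post = split_wrapping_quotes(text)
    for name in reversed(formatter_names.split(",")):
        inner = FORMATTERS[name](inner)
    return pre + inner + post


def de_camel(text: str) -> str:
    """Insert a space at each lowercase-or-digit to uppercase boundary."""
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", text)


def remove_code_formatting(text: str) -> str:
    """Turn an identifier back into space-separated lowercase words."""
    text = re.sub(r"[-_.]+", " ", text)   # assumption: only these delimiters
    text = de_camel(text)
    return " ".join(text.lower().split())


print(format_chain("hello world", "ALL_CAPS,SNAKE_CASE"))    # HELLO_WORLD
print(format_chain('"hello world"', "SNAKE_CASE"))           # "hello_world"
print(remove_code_formatting("hello_bigWorld.foo"))          # hello big world foo
print(format_chain(remove_code_formatting("helloBigWorld"), "SNAKE_CASE"))  # hello_big_world
```

In this sketch the reverse order matters: the SNAKE_CASE stand-in lowercases its input, so running it after ALL_CAPS would undo the capitalization.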

Talon Voice Integration

formatters.py defines several Talon captures and actions that bridge voice input with the formatting engine.

Captures

  • <user.formatters>: Captures a sequence of spoken formatting commands (e.g., "snake dub string") and returns them as a comma-separated list of formatter IDs (e.g., "SNAKE_CASE,DOUBLE_QUOTED_STRING").
  • <user.format_text>: Formats subsequent spoken text on the fly. It supports interspersed ImmuneString elements (such as spoken numbers or symbols captured via <user.formatter_immune>), which are inserted directly into the stream without being processed by the active formatter.
  • <user.format_code>: Specifically handles programming-focused formatting chains.
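The handling of immune elements can be sketched without Talon. In the example below, ImmuneString is modeled as a small dataclass and format_with_immune is a hypothetical helper; how the real capture joins adjacent pieces (with or without separators) is not stated above, so direct concatenation is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class ImmuneString:
    """Text that must bypass the active formatter (e.g. a spoken number)."""
    string: str


def snake_case(text: str) -> str:
    return "_".join(text.lower().split())


def format_with_immune(parts: list[Union[str, ImmuneString]],
                       formatter: Callable[[str], str]) -> str:
    """Format runs of ordinary text; pass ImmuneString parts through untouched."""
    out: list[str] = []
    pending: list[str] = []

    def flush() -> None:
        # Format the ordinary words collected so far as one phrase.
        if pending:
            out.append(formatter(" ".join(pending)))
            pending.clear()

    for part in parts:
        if isinstance(part, ImmuneString):
            flush()
            out.append(part.string)
        else:
            pending.append(part)
    flush()
    # Assumption: pieces are concatenated directly, with no separator added.
    return "".join(out)


print(format_with_immune(["hello world", ImmuneString("2"), "go"], snake_case))
# hello_world2go
```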

Actions

The Python file exposes several core actions to the Talon environment:

  • user.insert_formatted(phrase, formatters): Formats and inserts a spoken phrase.
  • user.formatters_reformat_last(formatters): Deletes the last formatted phrase inserted and re-inserts it using a new formatting style.
  • user.formatters_reformat_selection(formatters): Reads the current editor selection, clears it, runs it through the unformatting/formatting pipeline, and inserts the newly formatted text.
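The delete-and-reinsert behavior of user.formatters_reformat_last can be modeled with a plain Python stand-in for the text field. FakeEditor and its attributes are hypothetical and exist only for this illustration; the real actions operate on the focused application through Talon and may track history differently.

```python
from typing import Callable


def snake_case(text: str) -> str:
    return "_".join(text.lower().split())


def camel_case(text: str) -> str:
    return "".join(
        w.lower() if i == 0 else w.capitalize()
        for i, w in enumerate(text.split())
    )


class FakeEditor:
    """Hypothetical stand-in for a text field; not a Talon API."""

    def __init__(self) -> None:
        self.text = ""
        self.last_raw = ""        # the phrase as spoken
        self.last_inserted = ""   # the phrase as it was typed

    def insert_formatted(self, phrase: str, formatter: Callable[[str], str]) -> None:
        formatted = formatter(phrase)
        self.text += formatted
        self.last_raw, self.last_inserted = phrase, formatted

    def reformat_last(self, formatter: Callable[[str], str]) -> None:
        if not self.last_inserted:
            return  # nothing to reformat
        # Delete the previous insertion, then insert it again in the new style.
        self.text = self.text[: -len(self.last_inserted)]
        self.insert_formatted(self.last_raw, formatter)


editor = FakeEditor()
editor.insert_formatted("hello world", snake_case)
print(editor.text)   # hello_world
editor.reformat_last(camel_case)
print(editor.text)   # helloWorld
```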

Configuration and Spoken Vocabularies

The specific spoken words mapped to formatting behaviors are defined in four Talon list files. These files decouple user-facing vocabulary from the underlying Python logic.

  • code_formatter.talon-list: Maps code-centric spoken terms to their respective formatters. For example:

    • camel maps to PRIVATE_CAMEL_CASE (camelCase)
    • snake maps to SNAKE_CASE (snake_case)
    • constant maps to ALL_CAPS,SNAKE_CASE (CONSTANT_CASE)
    • dub string maps to DOUBLE_QUOTED_STRING ("text")
  • prose_formatter.talon-list: Maps spoken words for standard dictation and prose. For example, speaking sentence triggers CAPITALIZE_FIRST_WORD, and title triggers CAPITALIZE_ALL_WORDS.

  • reformatter.talon-list: Defines commands dedicated to reformatting selected text. Speaking cap applies CAPITALIZE, and unformat strips out formatting using REMOVE_FORMATTING.

  • word_formatter.talon-list: Configures basic word-level alterations, such as trot for adding a TRAILING_SPACE or proud for CAPITALIZE_FIRST_WORD.
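To make the mapping concrete, the sketch below parses a list file laid out like code_formatter.talon-list and resolves a sequence of spoken forms into the comma-separated ID string described for <user.formatters>. The sample contains only the four entries quoted above; parse_talon_list and spoken_to_ids are hypothetical helpers, since Talon loads list files itself and no such parsing is needed in practice.

```python
SAMPLE = """\
list: user.code_formatter
-
camel: PRIVATE_CAMEL_CASE
snake: SNAKE_CASE
constant: ALL_CAPS,SNAKE_CASE
dub string: DOUBLE_QUOTED_STRING
"""


def parse_talon_list(source: str) -> dict[str, str]:
    """Parse 'spoken form: value' entries that follow the '-' separator line."""
    _header, _, body = source.partition("\n-\n")
    entries: dict[str, str] = {}
    for line in body.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        spoken, _, value = line.partition(":")
        # Assumption: an entry without a value maps the spoken form to itself.
        entries[spoken.strip()] = value.strip() or spoken.strip()
    return entries


def spoken_to_ids(spoken_forms: list[str], vocabulary: dict[str, str]) -> str:
    """Join the formatter IDs for each spoken form into one comma-separated string."""
    return ",".join(vocabulary[s] for s in spoken_forms)


vocabulary = parse_talon_list(SAMPLE)
print(spoken_to_ids(["snake", "dub string"], vocabulary))  # SNAKE_CASE,DOUBLE_QUOTED_STRING
print(spoken_to_ids(["constant"], vocabulary))             # ALL_CAPS,SNAKE_CASE
```

Because a single spoken form such as constant can already expand to several IDs, joining with commas keeps the result in the same format that the formatting chain consumes.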