formatters
This directory implements the core formatting and text transformation engine for Talon. It enables users to dictate text and instantly format it into various styles (such as camelCase, snake_case, or Title Case), reformat existing text, and handle spacing or quotes.
The system is highly extensible, dividing formatting into a Python-based processing pipeline and a set of Talon list files that map natural spoken words to specific formatting rules.
Core Architecture
At the center of this directory is formatters.py, which defines the object-oriented structure for text transformations, registers the available formatters, and exposes Talon captures and actions.
Formatter Class Hierarchy
formatters.py defines an abstract base class Formatter and several specialized subclasses:
- CustomFormatter: Applies arbitrary Python lambda or function transformations to text (e.g., surrounding text in quotes or adding trailing spaces).
- CodeFormatter: Handles standard programming formatting. It splits text into words, filters out invalid characters, and applies one formatting function to the first word and another to subsequent words, joining them with a defined delimiter (e.g., _ for snake case or an empty string for camel case).
- TitleFormatter: Implements smart title casing, keeping minor words (such as "and", "but", "or", "to") in lower case unless they are the first or last word of the phrase.
- SentenceFormatter: Capitalizes only the first word of a phrase if it is lowercase.
- CapitalizeFormatter: Capitalizes the first non-whitespace character of a block of text.
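The word-splitting and title-casing behavior described above can be sketched in plain Python. This is a minimal illustration and not the code in formatters.py: the function names, the character filter, and the minor-word set (limited to the four words quoted above) are all assumptions.

```python
import re
from typing import Callable


def format_code(
    text: str,
    delimiter: str,
    first: Callable[[str], str],
    rest: Callable[[str], str],
) -> str:
    """Split into words, filter characters, format first/rest, then join.

    Hypothetical helper; the real CodeFormatter may filter differently.
    """
    # Assumed filter: keep only ASCII letters and digits in each word.
    words = [re.sub(r"[^0-9A-Za-z]", "", w) for w in text.split()]
    words = [w for w in words if w]  # drop words emptied by the filter
    formatted = [first(w) if i == 0 else rest(w) for i, w in enumerate(words)]
    return delimiter.join(formatted)


def snake_case(text: str) -> str:
    return format_code(text, "_", str.lower, str.lower)


def camel_case(text: str) -> str:
    return format_code(text, "", str.lower, str.capitalize)


# Assumed minor-word set: only the examples named in the text.
MINOR_WORDS = {"and", "but", "or", "to"}


def title_case(text: str) -> str:
    """Capitalize each word, except minor words that are not first or last."""
    words = text.split()
    last = len(words) - 1
    out = []
    for i, word in enumerate(words):
        if word.lower() in MINOR_WORDS and 0 < i < last:
            out.append(word.lower())
        else:
            out.append(word[:1].upper() + word[1:])
    return " ".join(out)


print(snake_case("hello big world"))     # hello_big_world
print(camel_case("hello big world"))     # helloBigWorld
print(title_case("to be or not to be"))  # To Be or Not to Be
```

Passing the first-word and rest-word functions separately is what lets one helper cover both snake case (same function twice) and camel case (lowercase first, capitalized rest).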
The Formatting and Unformatting Pipeline
The core engine supports both formatting (applying a style) and unformatting (reverting text to raw words before applying a new style).
- format_text_without_adding_to_history: Processes text through a chain of formatters. If multiple formatters are specified (e.g., ALL_CAPS,SNAKE_CASE via constant), they are applied in reverse order. It also preserves wrapping string delimiters (such as ', ", or """) so that formatting only affects the text inside the quotes.
- remove_code_formatting: Used during reformatting to convert formatted identifiers back into a space-separated string of lowercase words. It splits on delimiters (-, _, ., etc.) and uses de_camel to break up camelCase boundaries.
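A rough sketch of the two directions of the pipeline follows. It assumes a small registry of stand-in formatters, a regex-based de_camel, and a guessed delimiter set; format_inside_quotes, apply_formatters, and reformat are hypothetical names used only for illustration.

```python
import re

# Stand-ins for registered formatters, keyed by the IDs used in the text.
FORMATTERS = {
    "ALL_CAPS": str.upper,
    "SNAKE_CASE": lambda text: "_".join(text.lower().split()),
}


def apply_formatters(text: str, formatter_ids: str) -> str:
    """Apply a comma-separated chain of formatter IDs in reverse order."""
    for name in reversed(formatter_ids.split(",")):
        text = FORMATTERS[name](text)
    return text


def format_inside_quotes(text: str, formatter_ids: str) -> str:
    """Format only the text inside a wrapping quote pair, if there is one."""
    for quote in ('"""', '"', "'"):  # check the longest delimiter first
        if (
            len(text) >= 2 * len(quote)
            and text.startswith(quote)
            and text.endswith(quote)
        ):
            inner = text[len(quote):-len(quote)]
            return quote + apply_formatters(inner, formatter_ids) + quote
    return apply_formatters(text, formatter_ids)


def de_camel(text: str) -> str:
    """Insert a space at each lowercase-or-digit to uppercase boundary."""
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", text)


def remove_code_formatting(text: str) -> str:
    """Turn an identifier back into lowercase, space-separated words."""
    text = re.sub(r"[-_.:/]+", " ", text)  # assumed delimiter set
    return " ".join(de_camel(text).lower().split())


def reformat(text: str, formatter_ids: str) -> str:
    """Unformat first, then apply the new style."""
    return apply_formatters(remove_code_formatting(text), formatter_ids)


print(apply_formatters("hello world", "ALL_CAPS,SNAKE_CASE"))  # HELLO_WORLD
print(format_inside_quotes('"hello world"', "SNAKE_CASE"))     # "hello_world"
print(reformat("helloBigWorld", "SNAKE_CASE"))                 # hello_big_world
```

In this sketch, reverse order means that for ALL_CAPS,SNAKE_CASE the snake-case step runs before upper-casing.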
Talon Voice Integration
formatters.py defines several Talon captures and actions that bridge voice input with the formatting engine.
Captures
- <user.formatters>: Captures a sequence of spoken formatting commands (e.g., "snake dub string") and returns them as a comma-separated list of formatter IDs (e.g., "SNAKE_CASE,DOUBLE_QUOTED_STRING").
- <user.format_text>: Formats subsequent spoken text on the fly. It supports interspersed ImmuneString elements (such as spoken numbers or symbols captured via <user.formatter_immune>), which are inserted directly into the output without being processed by the active formatter.
- <user.format_code>: Specifically handles programming-focused formatting chains.
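To make the capture behavior concrete, here is a Talon-free sketch of the two ideas: joining spoken formatter words into a comma-separated ID string, and letting immune chunks bypass the formatter. The ImmuneString shape, the function names, and the way chunks are concatenated are assumptions; the vocabulary excerpt uses the spoken forms listed for code_formatter.talon-list.

```python
from typing import Callable, List, Union


class ImmuneString:
    """Marks text that must bypass formatting (assumed shape of the class)."""

    def __init__(self, string: str):
        self.string = string


# Hypothetical excerpt of the spoken form -> formatter ID vocabulary.
SPOKEN_FORMATTERS = {
    "snake": "SNAKE_CASE",
    "dub string": "DOUBLE_QUOTED_STRING",
    "constant": "ALL_CAPS,SNAKE_CASE",
}


def formatters_capture(spoken: List[str]) -> str:
    """Join the IDs for each spoken formatter into one comma-separated string."""
    return ",".join(SPOKEN_FORMATTERS[word] for word in spoken)


def format_chunks(
    chunks: List[Union[str, ImmuneString]],
    formatter: Callable[[str], str],
) -> str:
    """Format plain chunks and insert immune chunks unchanged."""
    out = []
    for chunk in chunks:
        if isinstance(chunk, ImmuneString):
            out.append(chunk.string)  # bypasses the formatter
        else:
            out.append(formatter(chunk))
    return "".join(out)  # assumption: chunks are concatenated directly


def snake(text: str) -> str:
    return "_".join(text.lower().split())


print(formatters_capture(["snake", "dub string"]))
# SNAKE_CASE,DOUBLE_QUOTED_STRING
print(format_chunks(["hello world", ImmuneString("2"), "more text"], snake))
# hello_world2more_text
```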
Actions
The Python file exposes several core actions to the Talon environment:
- user.insert_formatted(phrase, formatters): Formats and inserts a spoken phrase.
- user.formatters_reformat_last(formatters): Deletes the last formatted phrase inserted and re-inserts it using a new formatting style.
- user.formatters_reformat_selection(formatters): Reads the current editor selection, clears it, runs it through the unformatting/formatting pipeline, and inserts the newly formatted text.
Configuration and Spoken Vocabularies
The specific spoken words mapped to formatting behaviors are defined in four Talon list files. These files decouple user-facing vocabulary from the underlying Python logic.
- code_formatter.talon-list: Maps code-centric spoken terms to their respective formatters. For example:
  - camel maps to PRIVATE_CAMEL_CASE (camelCase)
  - snake maps to SNAKE_CASE (snake_case)
  - constant maps to ALL_CAPS,SNAKE_CASE (CONSTANT_CASE)
  - dub string maps to DOUBLE_QUOTED_STRING ("text")
- prose_formatter.talon-list: Maps spoken words for standard dictation and prose. For example, speaking sentence triggers CAPITALIZE_FIRST_WORD, and title triggers CAPITALIZE_ALL_WORDS.
- reformatter.talon-list: Defines commands dedicated to reformatting selected text. Speaking cap applies CAPITALIZE, and unformat strips out formatting using REMOVE_FORMATTING.
- word_formatter.talon-list: Configures basic word-level alterations, such as trot for adding a TRAILING_SPACE or proud for CAPITALIZE_FIRST_WORD.