
Voice formatting plan

Format LLM output for natural-sounding speech


Overview

Voice formatting automatically transforms raw text from your large language model (LLM) into a form that sounds natural when spoken by a text-to-speech (TTS) provider. This process, called Voice Input Formatted, is enabled by default for all assistants.

Formatting helps with things like:

  • Expanding numbers and currency (e.g., $42.50 → “forty two dollars and fifty cents”)
  • Expanding abbreviations (e.g., ST → “STREET”)
  • Spacing out phone numbers (e.g., 123-456-7890 → “1 2 3 4 5 6 7 8 9 0”)

You can turn off formatting if you want the TTS to read the raw LLM output.

How voice input formatting works

When enabled, the formatter runs a series of transformations on your text, each handled by a specific function. Here’s the order and what each function does:

| Step | Function Name | Description | Before | After | Default | Precedence |
|---|---|---|---|---|---|---|
| 1 | removeAngleBracketContent | Removes anything within <...>, except for <break>, <spell>, or double angle brackets << >>. | Hello <tag> world | Hello world | ✅ | - |
| 2 | removeMarkdownSymbols | Removes markdown symbols like _, `, and ~. Asterisks (*) are preserved in this step. | **Wanted** to say *hi* | **Wanted** to say *hi* | ✅ | 0 |
| 3 | removePhrasesInAsterisks | Removes text surrounded by single or double asterisks. | **Wanted** to say *hi* | to say | ❌ | 0 |
| 4 | replaceNewLinesWithPeriods | Converts new lines (\n) to periods for smoother speech. | Hello world\n to say\nWe have NASA | Hello world . to say . We have NASA | ✅ | 0 |
| 5 | replaceColonsWithPeriods | Replaces : with . for better phrasing. | price: $42.50 | price. $42.50 | ✅ | 0 |
| 6 | formatAcronyms | Converts known acronyms to lowercase (e.g., NASA → nasa) or spaces out unknown all-caps words unless they contain vowels. | NASA and .NET | nasa and .net | ✅ | 0 |
| 7 | formatDollarAmounts | Converts currency amounts to spoken words. | $42.50 | forty two dollars and fifty cents | ✅ | 0 |
| 8 | formatEmails | Replaces @ with “at” and . with “dot” in emails. | JOHN.DOE@example.COM | JOHN dot DOE at example dot COM | ✅ | 0 |
| 9 | formatDates | Converts date strings into spoken date format. | 2023 05 10 | Wednesday, May 10, 2023 | ✅ | 0 |
| 10 | formatTimes | Expands or simplifies time expressions. | 14:00 | 14 | ✅ | 0 |
| 11 | formatDistances, formatUnits, formatPercentages, formatPhoneNumbers | Converts units, distances, percentages, and phone numbers into spoken words. | 5km, 43 lb, 50%, 123-456-7890 | 5 kilometers, forty three pounds, 50 percent, 1 2 3 4 5 6 7 8 9 0 | ✅ | 0 |
| 12 | formatNumbers | Formats general numbers: years read as digits, large numbers spelled out, negative and decimal numbers clarified. | -9, 2.5, 2023 | minus nine, two point five, 2023 | ✅ | 0 |
| 13 | removeAsterisks | Removes all asterisk characters from the text. | **Bold** and *italic* | Bold and italic | ✅ | 1 |
| 14 | Applying Replacements | Applies user-defined final replacements like expanding street abbreviations. | 320 ST 21 RD | 320 STREET 21 ROAD | ✅ | - |
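
The ordering in the table can be pictured as a chain of string-to-string functions applied one after another. The sketch below is a toy illustration under that assumption, not Vapi's implementation: the three step bodies are simplified stand-ins written for this example, and only their names come from the table.

```typescript
// Toy illustration only: these bodies are simplified stand-ins,
// not Vapi's actual formatter code. Only the names come from the table.
type Formatter = (text: string) => string;

// Step 4: turn each newline (and the whitespace around it) into " . ".
const replaceNewLinesWithPeriods: Formatter = (text) =>
  text.replace(/\s*\n\s*/g, " . ");

// Step 5: replace every ":" with ".".
const replaceColonsWithPeriods: Formatter = (text) => text.replace(/:/g, ".");

// Step 13: strip every asterisk character.
const removeAsterisks: Formatter = (text) => text.replace(/\*/g, "");

// The array order is the execution order, mirroring the Step column.
const steps: Formatter[] = [
  replaceNewLinesWithPeriods,
  replaceColonsWithPeriods,
  removeAsterisks,
];

// Feed the output of each step into the next one.
function formatForSpeech(text: string): string {
  return steps.reduce((acc, step) => step(acc), text);
}

console.log(formatForSpeech("**Note**\nprice: $42.50"));
// prints "Note . price. $42.50"
```

Because each step sees the output of the previous one, the order matters: in this sketch the asterisks survive until the last step, just as the table keeps them until step 13.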

Customizing the formatting plan

You can control some aspects of formatting:

Enabled

Formatting is on by default. To disable, set:

voice.chunkPlan.formatPlan.enabled = false

Number-to-digits cutoff

Controls when numbers are read as digits instead of words.

  • Default: 2025 (current year)
  • Example: With a cutoff of 2025, numbers above 2025 are read as digits, while numbers up to the cutoff are spelled out as words.
  • To spell out larger numbers, set the cutoff higher (e.g., 300000).
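
The cutoff rule can be sketched as a simple comparison. The helper below is hypothetical and exists only to illustrate the rule described above. The field name `numberToDigitsCutoff` is an assumption, since this page does not spell it out, so check the API reference before relying on it.

```typescript
// Hypothetical helper illustrating the cutoff rule; not part of any Vapi SDK.
// Numbers strictly above the cutoff are read as digits, the rest as words.
function isReadAsDigits(value: number, cutoff: number = 2025): boolean {
  return value > cutoff;
}

// Config fragment. `numberToDigitsCutoff` is an assumed field name.
const formatPlan = { enabled: true, numberToDigitsCutoff: 300000 };

console.log(isReadAsDigits(12345)); // prints true (default cutoff of 2025)
console.log(isReadAsDigits(12345, formatPlan.numberToDigitsCutoff)); // prints false
```

With the higher cutoff, 12345 falls below the threshold and would be spelled out as words rather than read digit by digit.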

Replacements

Add exact or regex-based substitutions to customize output.

  • Example 1: Replace hello with hi:
    { type: 'exact', key: 'hello', value: 'hi' }
  • Example 2: Replace every five-letter word with hi:
    { type: 'regex', regex: '\\b[a-zA-Z]{5}\\b', value: 'hi' }

Apart from the enabled flag, only replacements and the number-to-digits cutoff are currently customizable. Other options are not exposed.
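
To make the two replacement types concrete, here is a minimal sketch of how such a list might be applied to a string. It is an illustration under stated assumptions, not Vapi's implementation: the `Replacement` type name and the `applyReplacements` helper are made up for this example, exact replacements are assumed to be case-sensitive literal substitutions of every occurrence, and regex replacements are assumed to match globally. Note that inside a TypeScript string literal the word-boundary escape has to be written as `\\b`.

```typescript
// Shapes mirror the two examples above; the type name is made up for this sketch.
type Replacement =
  | { type: "exact"; key: string; value: string }
  | { type: "regex"; regex: string; value: string };

// Hypothetical helper: applies each replacement in list order.
function applyReplacements(text: string, replacements: Replacement[]): string {
  return replacements.reduce((acc, r) => {
    if (r.type === "exact") {
      // split/join swaps every literal occurrence without regex escaping.
      return acc.split(r.key).join(r.value);
    }
    // Assumption: regex replacements apply to all matches ("g" flag).
    return acc.replace(new RegExp(r.regex, "g"), r.value);
  }, text);
}

const streetNames: Replacement[] = [
  { type: "exact", key: "ST", value: "STREET" },
  { type: "exact", key: "RD", value: "ROAD" },
];
console.log(applyReplacements("320 ST 21 RD", streetNames));
// prints "320 STREET 21 ROAD"

const fiveLetterWords: Replacement[] = [
  { type: "regex", regex: "\\b[a-zA-Z]{5}\\b", value: "hi" },
];
console.log(applyReplacements("hello there pie", fiveLetterWords));
// prints "hi hi pie"
```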


Turning formatting off

To disable all formatting and use raw LLM output, set either of these to false:

voice.chunkPlan.enabled = false
// or
voice.chunkPlan.formatPlan.enabled = false
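
The two switches above can be read as nested flags on the voice configuration. The sketch below models only those property paths. The interface names and the `isFormattingEnabled` helper are hypothetical, and it assumes an omitted flag means "on", consistent with formatting being enabled by default.

```typescript
// Minimal types for this sketch; only the property paths come from this page.
interface FormatPlan { enabled?: boolean }
interface ChunkPlan { enabled?: boolean; formatPlan?: FormatPlan }
interface VoiceConfig { chunkPlan?: ChunkPlan }

// Hypothetical helper: formatting stays on unless either flag is explicitly false.
function isFormattingEnabled(voice: VoiceConfig): boolean {
  return (
    voice.chunkPlan?.enabled !== false &&
    voice.chunkPlan?.formatPlan?.enabled !== false
  );
}

const defaults: VoiceConfig = {};
const rawOutput: VoiceConfig = { chunkPlan: { formatPlan: { enabled: false } } };

console.log(isFormattingEnabled(defaults)); // prints true
console.log(isFormattingEnabled(rawOutput)); // prints false
```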

Summary

  • Voice input formatting improves clarity and naturalness for TTS.
  • Each transformation step targets a specific pattern for better speech output.
  • You can customize or disable formatting as needed.