Control how your AI assistant pronounces specific words and phrases
Overview
Pronunciation dictionaries allow you to customize how your AI assistant pronounces specific words, names, acronyms, or technical terms. This feature is particularly useful for ensuring consistent pronunciation of brand names, proper nouns, or industry-specific terminology that might be mispronounced by default.
Pronunciation dictionaries are supported by the following voice providers:
ElevenLabs — phoneme rules (IPA and CMU Arpabet) and alias rules
Cartesia — “sounds-like” aliases and IPA notation (sonic-3 model only)
Vapi built-in voices — pronunciation dictionaries via a unified locator
Create a pronunciation dictionary through either the ElevenLabs or the Cartesia API. The dictionary ID from either provider can be used with Vapi built-in voices.
Create a test call or use the Vapi playground to verify that your custom pronunciations are working correctly.
Using Your Own ElevenLabs Account (BYOK)
If you’re using your own ElevenLabs API key (Bring Your Own Key), you can create pronunciation dictionaries directly in your ElevenLabs account and reference them in Vapi:
Create a pronunciation dictionary in your ElevenLabs account
Note the pronunciationDictionaryId and versionId from ElevenLabs
Use these IDs in your Vapi assistant configuration:
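A minimal sketch of what that configuration might look like, written as a Python dictionary that can be serialized to JSON. Only pronunciationDictionaryId and versionId come from the steps above. The pronunciationDictionaryLocators key, the "11labs" provider string, the voiceId field, and the placeholder IDs are assumptions that should be checked against the current Vapi API reference.

```python
import json


def build_voice_config(voice_id, dictionary_id, version_id, model="eleven_turbo_v2"):
    """Build a voice block that references one ElevenLabs pronunciation dictionary.

    Assumed field names: "provider", "voiceId", "model" and
    "pronunciationDictionaryLocators". Verify them against the Vapi API reference.
    """
    return {
        "provider": "11labs",  # assumed provider identifier
        "voiceId": voice_id,
        "model": model,
        "pronunciationDictionaryLocators": [
            {
                "pronunciationDictionaryId": dictionary_id,
                "versionId": version_id,
            }
        ],
    }


# Placeholder IDs for illustration only; substitute the values noted in ElevenLabs.
voice = build_voice_config("your-voice-id", "your-dictionary-id", "your-version-id")
print(json.dumps({"voice": voice}, indent=2))
```

The locator is kept as a list because an assistant may plausibly reference more than one dictionary; if the API accepts only a single locator, the list wrapper can be dropped.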
Case Sensitivity: Pronunciation dictionary searches are case-sensitive. Create separate entries for different capitalizations if needed.
Order Matters: Rules are applied in the order they appear in the dictionary. The first matching rule is used.
Testing: Always test pronunciation changes with your specific voice and model combination.
Phoneme Accuracy: Ensure proper stress marking for multi-syllable words when using phoneme rules.
Model Compatibility: ElevenLabs phoneme rules work only with eleven_turbo_v2 and eleven_flash_v2. Cartesia pronunciation dictionaries require the sonic-3 model.
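The case-sensitivity and ordering notes above can be illustrated with a toy model. This is not how either provider implements matching; it is a small Python sketch that assumes whole-word matching, and the rule entries and respellings are invented for the example.

```python
import re


def apply_rules(text, rules):
    """Replace whole words using the first rule whose key matches exactly."""

    def substitute(match):
        word = match.group(0)
        for string_to_replace, alias in rules:  # rules are tried in list order
            if word == string_to_replace:  # exact, case-sensitive comparison
                return alias
        return word  # no rule matched: leave the word unchanged

    return re.sub(r"[A-Za-z0-9]+", substitute, text)


# Hypothetical rules: the third entry is shadowed by the first one.
rules = [
    ("Vapi", "vah-pee"),
    ("VAPI", "V A P I"),
    ("Vapi", "vay-pie"),  # never reached, because the first match wins
]

print(apply_rules("Vapi and VAPI and vapi", rules))
# prints: vah-pee and V A P I and vapi
```

The lowercase "vapi" is left untouched because no rule matches that exact capitalization, which is why separate entries are needed for each variant.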
Common Issues
Pronunciation Not Applied
Verify you’re using a compatible model (ElevenLabs phoneme rules need eleven_turbo_v2 or eleven_flash_v2; Cartesia needs sonic-3)
Check that the word to replace exactly matches the text in your content (case-sensitive)
Ensure the pronunciation dictionary is properly referenced in your voice configuration
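The first two checks can be automated before a test call. The helper below is a hypothetical pre-flight check, not part of any SDK: the function name, the "elevenlabs" and "cartesia" labels, and the warning messages are all invented for this sketch, and it handles single-word rules only. The model names are the ones listed in this guide.

```python
import re

# Model names taken from the compatibility note in this guide.
ELEVENLABS_PHONEME_MODELS = {"eleven_turbo_v2", "eleven_flash_v2"}
CARTESIA_MODELS = {"sonic-3"}


def check_pronunciation_setup(provider, model, rule_words, text, uses_phonemes=False):
    """Return a list of warnings; an empty list means no issue was found."""
    warnings = []

    # Check 1: model compatibility.
    if provider == "elevenlabs" and uses_phonemes and model not in ELEVENLABS_PHONEME_MODELS:
        warnings.append(f"model {model!r} does not support ElevenLabs phoneme rules")
    if provider == "cartesia" and model not in CARTESIA_MODELS:
        warnings.append(f"model {model!r} does not support Cartesia dictionaries")

    # Check 2: rule words that appear in the text only with different capitalization.
    words_in_text = set(re.findall(r"[A-Za-z0-9]+", text))
    lowered = {word.lower() for word in words_in_text}
    for word in rule_words:
        if word not in words_in_text and word.lower() in lowered:
            warnings.append(f"rule {word!r} matches the text only when case is ignored")

    return warnings


for warning in check_pronunciation_setup("cartesia", "sonic-2", ["Vapi"], "call vapi now"):
    print(warning)
```

With the sample arguments the helper reports two warnings: one for the unsupported model and one for the capitalization mismatch between the rule "Vapi" and the text "vapi".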
SSML Conflicts
When pronunciation dictionaries are enabled, SSML parsing is automatically activated
Ensure any existing SSML tags in your content are well-formed, since they will now be parsed rather than passed through as plain text
Performance Impact
Large dictionaries may slightly increase processing time
Consider organizing rules by frequency of use for optimal performance