Pronunciation dictionaries
Overview
Pronunciation dictionaries allow you to customize how your AI assistant pronounces specific words, names, acronyms, or technical terms. This feature is particularly useful for ensuring consistent pronunciation of brand names, proper nouns, or industry-specific terminology that might be mispronounced by default.
Note: Pronunciation dictionaries are exclusive to ElevenLabs voices and require specific model configurations.
How Pronunciation Dictionaries Work
Create Pronunciation Rules
Define specific words or phrases and how they should be pronounced using either phonetic notation or word substitutions.
Upload Dictionary to Vapi
Create a pronunciation dictionary through Vapi’s API with your custom rules.
Sample Audio Examples
Below are examples demonstrating the difference between pronunciations with and without pronunciation dictionaries:
Corrected pronunciations:
- “Nginx” → “Engine-X” (using alias rule)
- “Kubernetes” → “/ˌkuːbərˈneɪtiːz/” (using phoneme rule)
Audio samples (players not reproduced here): one without and one with the pronunciation dictionary applied.
Prerequisites
- A Vapi assistant configured with an ElevenLabs voice
- Understanding of phonetic notation (IPA or CMU Arpabet) for phoneme-based rules
- Access to Vapi’s API for dictionary creation
Types of Pronunciation Rules
Phoneme Rules
Phoneme rules specify exact pronunciation using phonetic alphabets. These provide the most precise control over pronunciation.
Supported Alphabets:
- IPA (International Phonetic Alphabet): More universal, uses symbols like /tə'meɪtoʊ/
- CMU Arpabet: ASCII-based format, uses notation like T AH M EY T OW
Model Compatibility: Phoneme rules only work with specific ElevenLabs models:
- eleven_turbo_v2
- eleven_flash_v2
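As a rough illustration, a phoneme rule can be thought of as a small record that pairs the text to match with its phonetic spelling. The sketch below is not the confirmed Vapi schema: only stringToReplace appears in this document, while the type, phoneme, and alphabet fields and the alphabet labels "ipa" and "cmu-arpabet" are assumptions. The helper names are hypothetical.

```python
# Models listed above as supporting phoneme rules.
PHONEME_MODELS = {"eleven_turbo_v2", "eleven_flash_v2"}

# Alphabet labels are assumed identifiers, not confirmed API values.
ALPHABETS = {"ipa", "cmu-arpabet"}


def phoneme_rule(string_to_replace: str, phoneme: str, alphabet: str = "ipa") -> dict:
    """Build one phoneme rule as a plain dict (field names partly assumed)."""
    if alphabet not in ALPHABETS:
        raise ValueError(f"unsupported alphabet: {alphabet!r}")
    return {
        "stringToReplace": string_to_replace,  # field named in this document
        "type": "phoneme",                     # assumed field
        "phoneme": phoneme,                    # assumed field
        "alphabet": alphabet,                  # assumed field
    }


def supports_phonemes(model: str) -> bool:
    """True if the model is in the compatibility list given above."""
    return model in PHONEME_MODELS


rule = phoneme_rule("Kubernetes", "ˌkuːbərˈneɪtiːz")
arpabet_rule = phoneme_rule("tomato", "T AH M EY T OW", alphabet="cmu-arpabet")
```

Checking supports_phonemes before attaching phoneme rules to a voice is one way to catch a model mismatch early.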
Alias Rules
Alias rules replace words with alternative spellings or phrases. These work with all ElevenLabs models and are useful for:
- Converting acronyms to full phrases (e.g., “UN” → “United Nations”)
- Providing phonetic spellings for difficult words
- Standardizing pronunciation across different contexts
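To get a feel for what an alias rule does, you can approximate the substitution locally before uploading anything. This is only a sketch: the type and alias field names are assumptions, the helper names are hypothetical, and the real matching happens inside ElevenLabs and may differ from the whole-word regular expression used here.

```python
import re


def alias_rule(string_to_replace: str, alias: str) -> dict:
    """Build one alias rule; "type" and "alias" are assumed field names."""
    return {"stringToReplace": string_to_replace, "type": "alias", "alias": alias}


def preview_aliases(text: str, rules: list[dict]) -> str:
    """Approximate locally what the text would look like after alias rules."""
    for rule in rules:
        if rule["type"] != "alias":
            continue
        # Whole-word, case-sensitive match (a local approximation only).
        pattern = r"\b" + re.escape(rule["stringToReplace"]) + r"\b"
        # A function replacement keeps the alias literal (no backslash escapes).
        text = re.sub(pattern, lambda _m, a=rule["alias"]: a, text)
    return text


rules = [alias_rule("UN", "United Nations"), alias_rule("Nginx", "Engine-X")]
preview = preview_aliases("The UN runs Nginx.", rules)
print(preview)  # The United Nations runs Engine-X.
```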
Implementation
Create a Pronunciation Dictionary
Use Vapi’s API to create a pronunciation dictionary with your custom rules.
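The original request example is not reproduced here. As a stand-in, the sketch below shows how such a request might be assembled in Python. Everything specific in it is an assumption: the endpoint is a placeholder (take the real path from Vapi's API reference), the bearer-token header is assumed, and the name and rules body fields are illustrative. The request is built but deliberately not sent.

```python
import json
import urllib.request


def build_create_request(endpoint: str, api_key: str, name: str, rules: list) -> urllib.request.Request:
    """Assemble (but do not send) a dictionary-creation request."""
    body = json.dumps({"name": name, "rules": rules}).encode("utf-8")  # assumed body shape
    return urllib.request.Request(
        endpoint,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )


request = build_create_request(
    "https://api.example.invalid/pronunciation-dictionary",  # placeholder, not a real endpoint
    "YOUR_VAPI_API_KEY",
    "tech-terms",
    [
        {"stringToReplace": "Nginx", "type": "alias", "alias": "Engine-X"},
    ],
)
# Passing `request` to urllib.request.urlopen would send it; that step is
# omitted so the sketch has no side effects.
```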
The API responds with the details of the created dictionary.
Using Your Own ElevenLabs Account (BYOK)
If you’re using your own ElevenLabs API key (Bring Your Own Key), you can create pronunciation dictionaries directly in your ElevenLabs account and reference them in Vapi:
- Create a pronunciation dictionary in your ElevenLabs account
- Note the pronunciationDictionaryId and versionId from ElevenLabs
- Use these IDs in your Vapi assistant configuration:
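A minimal sketch of what that voice configuration might look like, written as a Python dict that serializes to JSON. The pronunciationDictionaryId and versionId names come from the text above; the provider label, the voiceId and model fields, and the pronunciationDictionaryLocators wrapper are assumptions to be checked against Vapi's API reference. The ID values are placeholders.

```python
import json


def voice_with_dictionary(voice_id: str, model: str, dictionary_id: str, version_id: str) -> dict:
    """Return a voice configuration that references one pronunciation dictionary."""
    return {
        "provider": "11labs",  # assumed provider label
        "voiceId": voice_id,   # assumed field
        "model": model,        # assumed field
        "pronunciationDictionaryLocators": [  # assumed wrapper field
            {
                "pronunciationDictionaryId": dictionary_id,
                "versionId": version_id,
            }
        ],
    }


voice = voice_with_dictionary(
    "YOUR_VOICE_ID", "eleven_turbo_v2", "YOUR_DICTIONARY_ID", "YOUR_VERSION_ID"
)
print(json.dumps({"voice": voice}, indent=2))
```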
Managing Pronunciation Dictionaries
List Your Dictionaries
Update Dictionary Rules
Best Practices
- Case Sensitivity: Pronunciation dictionary searches are case-sensitive. Create separate entries for different capitalizations if needed.
- Order Matters: Rules are applied in the order they appear in the dictionary. The first matching rule is used.
- Testing: Always test pronunciation changes with your specific voice and model combination.
- Phoneme Accuracy: Ensure proper stress marking for multi-syllable words when using phoneme rules.
- Model Compatibility: Remember that phoneme rules only work with specific ElevenLabs models.
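The case-sensitivity and ordering points can be made concrete with a small local model of rule lookup. This is an illustrative sketch, not the actual matching engine: the helper names are hypothetical, and the alias field name is an assumption.

```python
def first_matching_rule(word: str, rules: list[dict]):
    """Return the first rule whose stringToReplace equals the word exactly."""
    for rule in rules:
        if rule["stringToReplace"] == word:  # exact, case-sensitive comparison
            return rule
    return None


def with_case_variants(rule: dict) -> list[dict]:
    """Copy a rule once per distinct capitalization of its target string."""
    base = rule["stringToReplace"]
    variants = []
    for form in (base, base.lower(), base.upper(), base.capitalize()):
        if form not in variants:
            variants.append(form)
    return [{**rule, "stringToReplace": form} for form in variants]


rules = [
    {"stringToReplace": "Nginx", "type": "alias", "alias": "Engine-X"},
    {"stringToReplace": "Nginx", "type": "alias", "alias": "En-jinks"},  # shadowed by the rule above
]

winner = first_matching_rule("Nginx", rules)   # the "Engine-X" rule
missing = first_matching_rule("nginx", rules)  # None: case differs
expanded = with_case_variants(rules[0])        # Nginx, nginx, NGINX
```

Expanding a rule into its case variants before upload is one way to cover different capitalizations without maintaining each entry by hand.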
Common Issues
Pronunciation Not Applied
- Verify you’re using a compatible ElevenLabs model for phoneme rules
- Check that the stringToReplace exactly matches the text in your content (case-sensitive)
- Ensure the pronunciation dictionary is properly referenced in your voice configuration
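The first two checks above can be automated with a short script. The sketch below assumes rules are dicts with a stringToReplace field and an assumed type field; the diagnose helper is hypothetical and only inspects local data, so it cannot confirm that the dictionary is actually referenced in your voice configuration.

```python
# Models listed earlier as supporting phoneme rules.
PHONEME_MODELS = {"eleven_turbo_v2", "eleven_flash_v2"}


def diagnose(rules: list[dict], model: str, text: str) -> list[str]:
    """Return human-readable reasons why rules might not apply to the text."""
    problems = []
    for rule in rules:
        target = rule["stringToReplace"]
        if rule.get("type") == "phoneme" and model not in PHONEME_MODELS:
            problems.append(f"{target}: phoneme rule, but model {model} is not in the compatible list")
        if target not in text:
            if target.lower() in text.lower():
                problems.append(f"{target}: only a differently cased form appears in the text")
            else:
                problems.append(f"{target}: does not appear in the text")
    return problems


rules = [
    {"stringToReplace": "Kubernetes", "type": "phoneme"},
    {"stringToReplace": "Nginx", "type": "alias", "alias": "Engine-X"},
]
for line in diagnose(rules, "eleven_multilingual_v2", "We deploy nginx on Kubernetes."):
    print(line)
```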
SSML Conflicts
- When pronunciation dictionaries are enabled, SSML parsing is automatically activated
- Ensure any existing SSML tags in your content are properly formatted
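One inexpensive safeguard is to check that content parses as XML before it is sent. The sketch below uses Python's standard xml.etree.ElementTree and a hypothetical helper name. It only tests XML well-formedness by wrapping the fragment in a speak element; it does not validate tag names or attributes against the SSML specification, and it says nothing about which tags a given voice supports.

```python
import xml.etree.ElementTree as ET


def ssml_is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as XML inside a <speak> wrapper."""
    try:
        ET.fromstring(f"<speak>{fragment}</speak>")
    except ET.ParseError:
        return False
    return True


ok = ssml_is_well_formed('Hello <break time="500ms"/> world')      # self-closed tag
unclosed = ssml_is_well_formed('Hello <break time="500ms"> world')  # tag never closed
bare_amp = ssml_is_well_formed("AT&T")                              # unescaped ampersand
```

Plain text with characters such as & or < is a common source of failures once SSML parsing is active, so escaping them (for example with html.escape) before sending is worth considering.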
Performance Impact
- Large dictionaries may slightly increase processing time
- Consider organizing rules by frequency of use for optimal performance
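If you want to order rules by how often their targets occur, a sample of typical assistant output can serve as the frequency source. The sketch below is one possible approach with a hypothetical helper name; it does not measure whether the reordering changes processing time. Because Python's sort is stable, rules with equal counts (including duplicates of the same stringToReplace) keep their original relative order, so first-match behavior among them is preserved.

```python
from collections import Counter


def order_by_frequency(rules: list[dict], sample_texts: list[str]) -> list[dict]:
    """Sort rules so the most frequently occurring targets come first."""
    counts = Counter()
    for rule in rules:
        target = rule["stringToReplace"]
        # Count case-sensitive occurrences across all sample texts.
        counts[target] = sum(text.count(target) for text in sample_texts)
    return sorted(rules, key=lambda r: counts[r["stringToReplace"]], reverse=True)


rules = [
    {"stringToReplace": "Nginx", "type": "alias", "alias": "Engine-X"},
    {"stringToReplace": "Kubernetes", "type": "phoneme"},
]
samples = ["Kubernetes schedules pods.", "Nginx fronts the Kubernetes cluster."]
ordered = order_by_frequency(rules, samples)
```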