Multilingual support

Overview

Configure your voice assistant to communicate in multiple languages with automatic language detection, native voice quality, and cultural context awareness.

In this guide, you’ll learn to:

Set up automatic language detection for speech recognition
Configure multilingual voice synthesis
Design language-aware system prompts
Test and optimize multilingual performance

Multilingual Support: Two transcription providers support automatic language detection: Deepgram (Nova 2 or Nova 3 with the “Multi” language setting) and Google STT (with the “Multilingual” language setting). Either one allows seamless multilingual conversations.

Configure automatic language detection

Set up your transcriber to automatically detect and process multiple languages.

Dashboard

Navigate to Assistants in your Vapi Dashboard
Create a new assistant or edit an existing one
In the Transcriber section:
- Provider: Select Deepgram (recommended) or Google
- Model: For Deepgram, choose Nova 2 or Nova 3; for Google, choose Latest
- Language: Set to Multi (Deepgram) or Multilingual (Google)
- Other providers: single language only, no automatic detection
Click Save to apply the configuration
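
For readers who configure the assistant programmatically, the dashboard choices above map roughly onto a transcriber object like the one sketched below. This is a minimal sketch, not a confirmed schema: the keys (`provider`, `model`, `language`) and the string values (`"deepgram"`, `"nova-3"`, `"multi"`, and so on) are assumptions inferred from the dashboard labels, and `build_transcriber_config` is a hypothetical helper. Check the API reference for the exact identifiers before sending a payload like this.

```python
# Hypothetical helper: maps the dashboard choices to a transcriber payload.
# All keys and string values are assumptions inferred from the dashboard labels.
AUTO_DETECT_SETTINGS = {
    "deepgram": {"models": ("nova-2", "nova-3"), "language": "multi"},
    "google": {"models": ("latest",), "language": "multilingual"},
}


def build_transcriber_config(provider: str, model: str) -> dict:
    """Return a transcriber payload with automatic language detection enabled."""
    settings = AUTO_DETECT_SETTINGS.get(provider)
    if settings is None:
        # Other providers are single-language only, per the steps above.
        raise ValueError(f"{provider!r} has no automatic language detection setting")
    if model not in settings["models"]:
        raise ValueError(f"{model!r} is not a multilingual model for {provider!r}")
    return {"provider": provider, "model": model, "language": settings["language"]}


transcriber = build_transcriber_config("deepgram", "nova-3")
print(transcriber)
```

Rejecting unsupported providers up front keeps a single-language transcriber from being silently paired with a multilingual prompt.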

Provider Performance: Deepgram offers the best balance of speed and multilingual accuracy. Google provides broader language support but may be slower. Both providers support automatic language detection within conversations.

Set up multilingual voices

Configure your assistant to use appropriate voices for each detected language.

Dashboard

In the Voice section of your assistant:
- Provider: Select Azure (best multilingual coverage)
- Voice: Choose multilingual-auto for automatic voice selection
Alternative: Configure specific voices for each language:
- Select a primary voice (e.g., en-US-AriaNeural)
- Click Add Fallback Voices
- Add voices for other languages:
  - Spanish: es-ES-ElviraNeural
  - French: fr-FR-DeniseNeural
  - German: de-DE-KatjaNeural
Click Save to apply the voice configuration
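
The per-language voice list above can be represented as a simple lookup table. The sketch below uses the voice IDs from the steps above. The payload keys (`provider`, `voiceId`, `fallbackVoices`) and both helper functions are hypothetical, and this guide does not specify how the platform itself chooses among the configured voices.

```python
# Voice IDs are taken from the steps above. The payload keys are assumptions.
PRIMARY_VOICE = "en-US-AriaNeural"

# Hypothetical lookup table: base language code -> Azure voice ID.
LANGUAGE_VOICES = {
    "en": PRIMARY_VOICE,
    "es": "es-ES-ElviraNeural",
    "fr": "fr-FR-DeniseNeural",
    "de": "de-DE-KatjaNeural",
}


def voice_for_language(language_code: str) -> str:
    """Return the voice for a detected language, or the primary voice if unmapped."""
    base = language_code.split("-")[0].lower()
    return LANGUAGE_VOICES.get(base, PRIMARY_VOICE)


def build_voice_config() -> dict:
    """Assemble a voice payload: primary voice plus the other voices as fallbacks."""
    fallbacks = [v for v in LANGUAGE_VOICES.values() if v != PRIMARY_VOICE]
    return {"provider": "azure", "voiceId": PRIMARY_VOICE, "fallbackVoices": fallbacks}


print(voice_for_language("fr-CA"))
print(build_voice_config())
```

Matching on the base language code means a regional variant such as `fr-CA` still resolves to the configured French voice.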

Voice Provider Support: Unlike transcription, all major voice providers (Azure, ElevenLabs, OpenAI, etc.) support multiple languages. Azure offers the most comprehensive coverage with 400+ voices across 140+ languages.

Configure language-aware prompts

Create system prompts that explicitly list supported languages and handle multiple languages gracefully.

Dashboard

In the Model section, update your system prompt to explicitly list supported languages:

You are a helpful assistant that can communicate in English, Spanish, and French.
Language Instructions:
- You can speak and understand: English, Spanish, and French
- Automatically detect and respond in the user's language
- Switch languages seamlessly when the user changes languages
- Maintain consistent personality across all languages
- Use culturally appropriate greetings and formality levels
If a user speaks a language other than English, Spanish, or French, politely explain that you only support these three languages and ask them to continue in one of them.

Click Save to apply the prompt changes
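
Because the prompt names the supported languages in several places, it can help to generate it from a single list so the places never drift apart. The sketch below rebuilds the prompt above from a list of language names. `join_languages` and `build_system_prompt` are hypothetical helpers, and the final sentence is reworded slightly so it stays correct for any number of languages.

```python
def join_languages(languages: list[str], conjunction: str) -> str:
    """Join names as 'A', 'A and B', or 'A, B, and C'."""
    if not languages:
        raise ValueError("at least one language is required")
    if len(languages) == 1:
        return languages[0]
    if len(languages) == 2:
        return f"{languages[0]} {conjunction} {languages[1]}"
    return f"{', '.join(languages[:-1])}, {conjunction} {languages[-1]}"


def build_system_prompt(languages: list[str]) -> str:
    """Build a system prompt that explicitly lists every supported language."""
    listed = join_languages(languages, "and")
    alternatives = join_languages(languages, "or")
    return "\n".join([
        f"You are a helpful assistant that can communicate in {listed}.",
        "Language Instructions:",
        f"- You can speak and understand: {listed}",
        "- Automatically detect and respond in the user's language",
        "- Switch languages seamlessly when the user changes languages",
        "- Maintain consistent personality across all languages",
        "- Use culturally appropriate greetings and formality levels",
        f"If a user speaks a language other than {alternatives}, politely explain "
        "that you only support the languages listed above and ask them to "
        "continue in one of them.",
    ])


print(build_system_prompt(["English", "Spanish", "French"]))
```

Adding a fourth language then means changing one list rather than editing three sentences by hand.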

Critical for Multilingual Success: You must explicitly list the supported languages in your system prompt. Without this explicit instruction, assistants often fail to recognize that they can speak multiple languages.

Add multilingual greetings

Configure greeting messages that work across multiple languages.

Dashboard

In the First Message field, enter a multilingual greeting:

Hello! I can assist you in English, Spanish, or French. How can I help you today?

Optional: For more personalized greetings, use the Advanced Message Configuration:
- Enable Language-Specific Messages
- Add greetings for each target language
Click Save to apply the greeting

Test your multilingual assistant

Validate your configuration with different languages and scenarios.

Dashboard

Use the Test Assistant feature in your dashboard
Test these scenarios:
- Start conversations in different languages
- Switch languages mid-conversation
- Use mixed-language input
Monitor the Call Analytics for:
- Language detection accuracy
- Voice quality consistency
- Response appropriateness
Adjust configuration based on test results
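
To make sure no language pair is skipped during manual testing, the three scenario types above can be expanded into a checklist. The sketch below is a hypothetical helper that only generates the list of cases to run by hand. It does not call the Test Assistant feature or any API.

```python
from itertools import combinations, permutations


def build_test_scenarios(languages: list[str]) -> list[str]:
    """Expand the three scenario types into one checklist entry per case."""
    scenarios = [f"Start the conversation in {lang}" for lang in languages]
    # Order matters for switching: English -> Spanish differs from Spanish -> English.
    scenarios += [
        f"Switch from {a} to {b} mid-conversation"
        for a, b in permutations(languages, 2)
    ]
    # Order does not matter for mixed input, so use unordered pairs.
    scenarios += [
        f"Mix {a} and {b} in a single utterance"
        for a, b in combinations(languages, 2)
    ]
    return scenarios


for scenario in build_test_scenarios(["English", "Spanish", "French"]):
    print(f"[ ] {scenario}")
```

For three languages this yields 12 cases: 3 starting languages, 6 ordered switches, and 3 mixed pairs.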

Provider capabilities (accurate as of testing)

Speech Recognition (Transcription)

Provider	Multilingual Support	Languages	Notes
Deepgram	✅ Full auto-detection	100+	Recommended: Nova 2/Nova 3 with “Multi” language setting
Google STT	✅ Full auto-detection	125+	Latest models with “Multilingual” language setting
AssemblyAI	❌ English only	English	No multilingual support
Azure STT	❌ Single language	100+	Many languages, but no auto-detection
OpenAI Whisper	❌ Single language	90+	Many languages, but no auto-detection
Gladia	❌ Single language	80+	Many languages, but no auto-detection
Speechmatics	❌ Single language	50+	Many languages, but no auto-detection
Talkscriber	❌ Single language	40+	Many languages, but no auto-detection

Voice Synthesis (Text-to-Speech)

Provider	Languages	Multilingual Voice Selection	Best For
Azure	140+	✅ Automatic	Maximum language coverage
ElevenLabs	30+	✅ Automatic	Premium voice quality
OpenAI TTS	50+	✅ Automatic	Consistent quality across languages
PlayHT	80+	✅ Automatic	Cost-effective scaling

Common challenges and solutions

Language detection is inaccurate

Solutions:

Use Deepgram (Nova 2/Nova 3 with “Multi”) or Google STT (with “Multilingual”)
Ensure high-quality audio input for better detection accuracy
Test with native speakers of target languages
Check which language combinations your provider's multilingual setting covers, and test the specific combinations your callers will use

Assistant doesn't realize it can speak multiple languages

Solutions:

Explicitly list all supported languages in your system prompt
Include language capabilities in the assistant’s instructions
Test the prompt with multilingual conversations
Avoid generic “multilingual” statements without specifics

Transcription is too slow

Solutions:

Use Deepgram Nova 2/Nova 3 for optimal speed and multilingual support
For Google STT, use latest models for better performance
Consider the speed vs accuracy tradeoff for your use case
Optimize audio quality and format to improve processing speed

Voice quality varies between languages

Solutions:

Test different voice providers for each language
Use Azure for maximum language coverage
Configure fallback voices as backup options
Consider premium providers for key languages

Next steps

Now that you have multilingual support configured:

Build a complete multilingual agent: Follow our step-by-step implementation guide
Custom voices: Set up region-specific custom voices
System prompting: Design effective multilingual prompts
Call analysis: Monitor language performance and usage