Multilingual support

Enable voice assistants to speak multiple languages fluently

Overview

Configure your voice assistant to communicate in multiple languages with automatic language detection, native voice quality, and cultural context awareness.

In this guide, you’ll learn to:

  • Set up automatic language detection for speech recognition
  • Configure multilingual voice synthesis
  • Design language-aware system prompts
  • Test and optimize multilingual performance

Multilingual Support: Two transcription providers offer automatic language detection for multilingual conversations: Deepgram (Nova 2 or Nova 3 with the “Multi” language setting) and Google STT (with the “Multilingual” language setting).

Configure automatic language detection

Set up your transcriber to automatically detect and process multiple languages.

  1. Navigate to Assistants in your Vapi Dashboard
  2. Create a new assistant or edit an existing one
  3. In the Transcriber section:
    • Provider: Select Deepgram (recommended) or Google
    • Model: For Deepgram, choose Nova 2 or Nova 3; for Google, choose Latest
    • Language: Set to Multi (Deepgram) or Multilingual (Google)
    • Other providers: Single language only, with no automatic detection
  4. Click Save to apply the configuration
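
The dashboard choices above can also be thought of as a small configuration payload. The following is a minimal sketch in Python, not a confirmed API schema: the keys (`provider`, `model`, `language`) and the value spellings (`deepgram`, `nova-2`, `multi`, and so on) are assumptions made for illustration, so check them against the API reference before relying on them.

```python
# Hypothetical mapping of the dashboard choices above to payload values.
# Every key and value spelling here is an assumption for illustration.
AUTO_DETECT_SETTINGS = {
    "deepgram": {"models": {"nova-2", "nova-3"}, "language": "multi"},
    "google": {"models": {"latest"}, "language": "multilingual"},
}


def build_transcriber_config(provider: str, model: str) -> dict:
    """Return a transcriber config with automatic language detection enabled."""
    settings = AUTO_DETECT_SETTINGS.get(provider)
    if settings is None:
        # Providers outside the mapping are single-language only in this guide.
        raise ValueError(f"{provider!r} has no automatic language detection setting")
    if model not in settings["models"]:
        raise ValueError(f"{model!r} is not a listed model for {provider!r}")
    return {"provider": provider, "model": model, "language": settings["language"]}


if __name__ == "__main__":
    print(build_transcriber_config("deepgram", "nova-3"))
```

Rejecting unknown providers early mirrors the guide's rule that only Deepgram and Google support automatic detection.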

Provider Performance: Deepgram offers the best balance of speed and multilingual accuracy. Google provides broader language support but may be slower. Both providers support automatic language detection within conversations.

Set up multilingual voices

Configure your assistant to use appropriate voices for each detected language.

  1. In the Voice section of your assistant:
    • Provider: Select Azure (best multilingual coverage)
    • Voice: Choose multilingual-auto for automatic voice selection
  2. Alternative: Configure specific voices for each language:
    • Select a primary voice (e.g., en-US-AriaNeural)
    • Click Add Fallback Voices
    • Add voices for other languages:
      • Spanish: es-ES-ElviraNeural
      • French: fr-FR-DeniseNeural
      • German: de-DE-KatjaNeural
  3. Click Save to apply the voice configuration
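
To make the per-language voice option concrete, here is a minimal Python sketch of the selection logic it implies. The voice names come from the steps above; the function `select_voice` and the idea of keying on the base language of a tag such as `es-MX` are assumptions for illustration, not a description of how the platform resolves voices internally.

```python
# Voice names are taken from the steps above; the lookup logic is illustrative.
PRIMARY_VOICE = "en-US-AriaNeural"

VOICES_BY_LANGUAGE = {
    "es": "es-ES-ElviraNeural",
    "fr": "fr-FR-DeniseNeural",
    "de": "de-DE-KatjaNeural",
}


def select_voice(language_tag: str) -> str:
    """Pick a voice for a language tag such as 'es' or 'es-MX'.

    Falls back to the primary voice when no language-specific voice is configured.
    """
    base_language = language_tag.split("-")[0].lower()
    return VOICES_BY_LANGUAGE.get(base_language, PRIMARY_VOICE)


if __name__ == "__main__":
    for tag in ("es-MX", "fr", "ja-JP"):
        print(tag, "->", select_voice(tag))
```

Matching on the base language keeps regional variants (for example Mexican Spanish) from silently dropping to the English voice.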

Voice Provider Support: Unlike transcription, all major voice providers (Azure, ElevenLabs, OpenAI, etc.) support multiple languages. Azure offers the most comprehensive coverage with 400+ voices across 140+ languages.

Configure language-aware prompts

Create system prompts that explicitly list supported languages and handle multiple languages gracefully.

  1. In the Model section, update your system prompt to explicitly list supported languages:

       You are a helpful assistant that can communicate in English, Spanish, and French.

       Language Instructions:
       - You can speak and understand: English, Spanish, and French
       - Automatically detect and respond in the user's language
       - Switch languages seamlessly when the user changes languages
       - Maintain consistent personality across all languages
       - Use culturally appropriate greetings and formality levels

       If a user speaks a language other than English, Spanish, or French, politely explain that you only support these three languages and ask them to continue in one of them.

  2. Click Save to apply the prompt changes
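
If the set of supported languages changes often, the prompt can be generated from a single list so that every mention stays in sync. The sketch below is one possible approach in Python; the helper names `format_language_list` and `build_system_prompt` are hypothetical, and the generated wording is a lightly generalized version of the prompt shown in this section.

```python
from typing import List


def format_language_list(languages: List[str], conjunction: str = "and") -> str:
    """Join language names in English, e.g. 'English, Spanish, and French'."""
    if len(languages) < 2:
        raise ValueError("a multilingual prompt needs at least two languages")
    if len(languages) == 2:
        return f"{languages[0]} {conjunction} {languages[1]}"
    return f"{', '.join(languages[:-1])}, {conjunction} {languages[-1]}"


def build_system_prompt(languages: List[str]) -> str:
    """Build a system prompt that explicitly lists every supported language."""
    listed = format_language_list(languages, "and")
    alternatives = format_language_list(languages, "or")
    lines = [
        f"You are a helpful assistant that can communicate in {listed}.",
        "Language Instructions:",
        f"- You can speak and understand: {listed}",
        "- Automatically detect and respond in the user's language",
        "- Switch languages seamlessly when the user changes languages",
        "- Maintain consistent personality across all languages",
        "- Use culturally appropriate greetings and formality levels",
        f"If a user speaks a language other than {alternatives}, politely explain "
        "that you only support these languages and ask them to continue in one of them.",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_system_prompt(["English", "Spanish", "French"]))
```

Generating the prompt this way guards against the common mistake of adding a language to the transcriber and voice settings but forgetting to name it in the prompt.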

Critical for Multilingual Success: You must explicitly list the supported languages in your system prompt. Without this explicit instruction, assistants often fail to recognize that they can respond in more than one language.

Add multilingual greetings

Configure greeting messages that work across multiple languages.

  1. In the First Message field, enter a multilingual greeting:

       Hello! I can assist you in English, Spanish, or French. How can I help you today?

  2. Optional: For more personalized greetings, use the Advanced Message Configuration:
    • Enable Language-Specific Messages
    • Add greetings for each target language
  3. Click Save to apply the greeting
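
The relationship between the single multilingual greeting and the language-specific greetings can be sketched as a lookup with a default. This Python example is illustrative only: the Spanish and French greeting texts are sample translations, and `greeting_for` is a hypothetical helper, not a platform function.

```python
from typing import Optional

# Used when the caller's language is not known before the call starts.
DEFAULT_GREETING = (
    "Hello! I can assist you in English, Spanish, or French. "
    "How can I help you today?"
)

# Sample language-specific greetings (illustrative translations).
GREETINGS = {
    "en": "Hello! How can I help you today?",
    "es": "¡Hola! ¿En qué puedo ayudarte hoy?",
    "fr": "Bonjour ! Comment puis-je vous aider aujourd'hui ?",
}


def greeting_for(language_tag: Optional[str]) -> str:
    """Return a language-specific greeting, or the multilingual default."""
    if not language_tag:
        return DEFAULT_GREETING
    base_language = language_tag.split("-")[0].lower()
    return GREETINGS.get(base_language, DEFAULT_GREETING)


if __name__ == "__main__":
    print(greeting_for("es-ES"))
    print(greeting_for(None))
```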

Test your multilingual assistant

Validate your configuration with different languages and scenarios.

  1. Use the Test Assistant feature in your dashboard
  2. Test these scenarios:
    • Start conversations in different languages
    • Switch languages mid-conversation
    • Use mixed-language input
  3. Monitor the Call Analytics for:
    • Language detection accuracy
    • Voice quality consistency
    • Response appropriateness
  4. Adjust configuration based on test results
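
Language detection accuracy is easier to compare between configurations when it is computed per language from a labeled set of test utterances. The following Python sketch assumes you record, by hand, the language each test utterance was spoken in and the language the transcriber reported; the `UtteranceResult` type is hypothetical and does not correspond to any Call Analytics export format.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class UtteranceResult:
    """One hand-labeled test utterance (hypothetical record shape)."""
    expected_language: str
    detected_language: str


def detection_accuracy(results: Iterable[UtteranceResult]) -> Dict[str, float]:
    """Return the fraction of correctly detected utterances per expected language."""
    totals: Dict[str, int] = defaultdict(int)
    correct: Dict[str, int] = defaultdict(int)
    for result in results:
        totals[result.expected_language] += 1
        if result.detected_language == result.expected_language:
            correct[result.expected_language] += 1
    return {language: correct[language] / totals[language] for language in totals}


if __name__ == "__main__":
    sample = [
        UtteranceResult("en", "en"),
        UtteranceResult("es", "es"),
        UtteranceResult("es", "en"),
        UtteranceResult("fr", "fr"),
    ]
    for language, accuracy in sorted(detection_accuracy(sample).items()):
        print(f"{language}: {accuracy:.0%}")
```

Reporting accuracy per language, rather than one overall figure, shows whether a single weak language is being hidden by strong results in the others.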

Provider capabilities (Accurate as of testing)

Speech Recognition (Transcription)

| Provider | Multilingual Support | Languages | Notes |
|---|---|---|---|
| Deepgram | ✅ Full auto-detection | 100+ | Recommended: Nova 2/Nova 3 with “Multi” language setting |
| Google STT | ✅ Full auto-detection | 125+ | Latest models with “Multilingual” language setting |
| Assembly AI | ❌ English only | English | No multilingual support |
| Azure STT | ❌ Single language | 100+ | Many languages, but no auto-detection |
| OpenAI Whisper | ❌ Single language | 90+ | Many languages, but no auto-detection |
| Gladia | ❌ Single language | 80+ | Many languages, but no auto-detection |
| Speechmatics | ❌ Single language | 50+ | Many languages, but no auto-detection |
| Talkscriber | ❌ Single language | 40+ | Many languages, but no auto-detection |

Voice Synthesis (Text-to-Speech)

| Provider | Languages | Multilingual Voice Selection | Best For |
|---|---|---|---|
| Azure | 140+ | ✅ Automatic | Maximum language coverage |
| ElevenLabs | 30+ | ✅ Automatic | Premium voice quality |
| OpenAI TTS | 50+ | ✅ Automatic | Consistent quality across languages |
| PlayHT | 80+ | ✅ Automatic | Cost-effective scaling |

Common challenges and solutions

Solutions for language detection accuracy:

  • Use Deepgram (Nova 2/Nova 3 with “Multi”) or Google STT (with “Multilingual”)
  • Ensure high-quality audio input for better detection accuracy
  • Test with native speakers of target languages
  • Consider provider-specific language combinations for optimal results

Solutions when the assistant does not respond in the user's language:

  • Explicitly list all supported languages in your system prompt
  • Include language capabilities in the assistant’s instructions
  • Test the prompt with multilingual conversations
  • Avoid generic “multilingual” statements without specifics

Solutions for transcription speed and latency:

  • Use Deepgram Nova 2/Nova 3 for optimal speed and multilingual support
  • For Google STT, use latest models for better performance
  • Consider the speed vs accuracy tradeoff for your use case
  • Optimize audio quality and format to improve processing speed

Solutions for voice quality across languages:

  • Test different voice providers for each language
  • Use Azure for maximum language coverage
  • Configure fallback voices as backup options
  • Consider premium providers for key languages

Next steps

Now that you have multilingual support configured: