Vapi’s multilingual support is primarily facilitated through transcribers, which are part of the speech-to-text process. The pipeline consists of three key elements: speech-to-text, the LLM (which acts as the brain of the operation), and text-to-speech. Each of these elements can be customized using different providers.

Transcribers (Speech-to-Text)

Currently, Vapi supports two providers for speech-to-text transcriptions:

  • Deepgram (Nova family models)
  • Talkscriber (Whisper model)

Each provider supports different languages. For more detailed information, you can visit your dashboard and navigate to the transcribers tab on the assistant page. Here, you can see the languages supported by each provider and the available models. Note that not all models support all languages. For specific details, you can refer to the documentation for the corresponding providers.
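
As a rough sketch of how a transcriber and language might be paired, the Python snippet below builds a transcriber block and rejects combinations that are not listed. The field names (provider, model, language), the model name nova-2, and the contents of SUPPORTED_LANGUAGES are illustrative assumptions, not an authoritative list; the dashboard remains the source of truth for what each provider and model supports.

```python
import json

# Hypothetical, deliberately incomplete table of languages per provider.
# The real list is shown in the dashboard's transcribers tab.
SUPPORTED_LANGUAGES = {
    "deepgram": {"en", "es", "fr", "de"},
    "talkscriber": {"en", "es"},
}


def build_transcriber(provider: str, model: str, language: str) -> dict:
    """Return a transcriber block, or raise if the pairing is not in the table."""
    languages = SUPPORTED_LANGUAGES.get(provider)
    if languages is None:
        raise ValueError(f"unknown transcriber provider: {provider!r}")
    if language not in languages:
        raise ValueError(f"{provider} is not listed as supporting {language!r}")
    # Field names are assumed for illustration.
    return {"provider": provider, "model": model, "language": language}


if __name__ == "__main__":
    config = {"transcriber": build_transcriber("deepgram", "nova-2", "es")}
    print(json.dumps(config, indent=2))
```

Validating the pairing before sending the configuration catches the "not all models support all languages" case early instead of at call time.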

Voice (Text-to-Speech)

Once you have set your transcriber and corresponding language, you can choose a voice for text-to-speech in that language. For example, you can choose a voice with a Spanish accent if needed.

Vapi currently supports the following providers for text-to-speech:

  • PlayHT
  • ElevenLabs
  • Rime-ai
  • Deepgram
  • OpenAI
  • Azure
  • Lmnt
  • Neets

Each provider offers varying degrees of language support. Azure, for instance, supports the most languages, with approximately 400 prebuilt voices across 140 languages and variants. You can also create your own custom languages with other providers.

Multilingual Support

For multilingual support, you can choose providers like ElevenLabs or Azure, which have models and voices designed for this purpose. This allows your voice assistant to understand and respond in multiple languages, enhancing the user experience for non-English speakers.

To set up multilingual support, you no longer need to specify the desired language in the voice section when configuring the voice assistant. That setting is deprecated.

Instead, choose a voice that supports the desired language directly from your voice provider. You can do this when setting up or modifying your voice assistant.

Here is an example of how to set up a voice assistant that speaks Spanish:

  "voice": {
    "provider": "azure",
    "voiceId": "es-ES-ElviraNeural"
  }

In this example, the voice es-ES-ElviraNeural from the provider azure supports Spanish. You can replace es-ES-ElviraNeural with the ID of any other voice that supports your desired language.
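
To illustrate swapping the voice per language, here is a minimal Python sketch that looks up an Azure voice by locale and emits the same voice block as above. The AZURE_VOICES table and the build_voice helper are hypothetical; es-ES-ElviraNeural comes from the example above, and the other entries are Azure prebuilt neural voices chosen only for illustration.

```python
import json

# Illustrative locale-to-voice table; extend it with voices from your provider.
AZURE_VOICES = {
    "es-ES": "es-ES-ElviraNeural",
    "en-US": "en-US-JennyNeural",
    "fr-FR": "fr-FR-DeniseNeural",
    "de-DE": "de-DE-KatjaNeural",
}


def build_voice(locale: str) -> dict:
    """Return a voice block for the given locale, or raise if none is mapped."""
    try:
        voice_id = AZURE_VOICES[locale]
    except KeyError:
        raise ValueError(f"no voice mapped for locale {locale!r}") from None
    # No separate language field is set: the voice ID itself determines the language.
    return {"provider": "azure", "voiceId": voice_id}


if __name__ == "__main__":
    print(json.dumps({"voice": build_voice("es-ES")}, indent=2))
```

Keeping the mapping in one place makes it easy to add a language later by adding a single entry.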

By leveraging Vapi’s multilingual support, you can make your voice assistant accessible to a wider audience and provide a better experience for non-English speakers.