Sesame | Vapi

What is Sesame CSM-1B?

Sesame CSM-1B is an open-source text-to-speech (TTS) model. Vapi hosts it, so you can use it in your voice applications without running the model yourself. It produces natural-sounding speech and offers a default voice as well as voice cloning.

Key Features:

Vapi-Hosted Solution: Access this open-source model directly through Vapi without managing your own infrastructure
Voice Options: Offers a default voice and voice cloning capabilities

Integration Benefits:

Simplified setup with no need to self-host the model
Consistent performance through Vapi’s optimized infrastructure
Seamless compatibility with all Vapi voice applications
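
As a rough illustration of what selecting Sesame for an assistant might look like in a configuration payload, here is a minimal sketch. The field names (`provider`, `model`, `voiceId`) and their values are assumptions made for this example, not confirmed by this page; check Vapi's API reference for the actual schema before relying on them. The helper name `build_sesame_voice_config` is hypothetical.

```python
import json


def build_sesame_voice_config(voice_id: str) -> dict:
    """Build a voice settings fragment that selects Sesame.

    All keys and values below are assumed for illustration; verify them
    against Vapi's API reference.
    """
    if not voice_id.strip():
        raise ValueError("voice_id must not be empty")
    return {
        "provider": "sesame",  # assumed provider identifier
        "model": "csm-1b",     # assumed model identifier
        "voiceId": voice_id,   # default voice or a cloned voice (assumed key)
    }


if __name__ == "__main__":
    # "my-cloned-voice" is a placeholder, not a real voice ID.
    print(json.dumps({"voice": build_sesame_voice_config("my-cloned-voice")}, indent=2))
```

Keeping the voice settings in one small function makes it easy to switch between the default voice and a cloned one without touching the rest of the assistant configuration.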

Use Cases:

Virtual assistants and conversational AI
Content narration and audio generation
Interactive voice applications
Prototyping voice-driven experiences

Voice Cloning:

Sesame supports voice cloning. To clone a voice:

Open the additional configuration tab (below the voice tab) on the assistants page
Upload a WAV file containing your voice sample
Provide the transcript of the audio file
Name your custom voice
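
Before going through the steps above, it can help to confirm locally that the sample really is a readable WAV file and that the transcript and voice name are filled in. The sketch below uses only Python's standard `wave` module. The function name `check_voice_sample` and the checks it performs are illustrative assumptions; this page does not state any requirements for duration, sample rate, or channel count, so none are enforced here.

```python
import wave
from pathlib import Path


def check_voice_sample(wav_path: str, transcript: str, voice_name: str) -> dict:
    """Sanity-check the three inputs the cloning form asks for.

    Returns basic facts about the audio so you can review them before upload.
    """
    path = Path(wav_path)
    if path.suffix.lower() != ".wav":
        raise ValueError(f"expected a .wav file, got {path.name!r}")
    if not transcript.strip():
        raise ValueError("transcript must not be empty")
    if not voice_name.strip():
        raise ValueError("voice name must not be empty")

    # wave.open raises wave.Error if the file is not a WAV it can read.
    with wave.open(str(path), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        channels = wav.getnchannels()

    if frames == 0 or rate == 0:
        raise ValueError("WAV file contains no audio")

    return {
        "channels": channels,
        "sample_rate_hz": rate,
        "duration_s": frames / rate,
    }
```

A call such as `check_voice_sample("sample.wav", "Text spoken in the recording", "My Voice")` returns a small summary dictionary or raises an error that points at the input to fix.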

Current Limitations:

The model currently has some limitations. Additional features may be introduced in future updates.