OpenAI Realtime

The Realtime API is currently in beta, and not recommended for production use by OpenAI. We’re excited to have you try this new feature and welcome your feedback as we continue to refine and improve the experience.

OpenAI’s Realtime API enables developers to use a native speech-to-speech model. Unlike other Vapi configurations which orchestrate a transcriber, model and voice API to simulate speech-to-speech, OpenAI’s Realtime API natively processes audio in and audio out.

To start using it with your Vapi assistants, select gpt-4o-realtime-preview-2024-12-17 as your model.

Please note that only OpenAI voices may be selected while using this model. The voice selection will not act as a TTS (text-to-speech) model, but rather as the voice used within the speech-to-speech model.
Also note that we don’t currently support Knowledge Bases with the Realtime API.
Lastly, note that our Realtime integration still retains the rest of Vapi’s orchestration layer such as Endpointing and Interruption models to enable a reliable conversational flow.