OpenAI Realtime
Build voice assistants with OpenAI’s native speech-to-speech models for ultra-low latency conversations
Build voice assistants with OpenAI’s native speech-to-speech models for ultra-low latency conversations
OpenAI’s Realtime API enables developers to use a native speech-to-speech model. Unlike other Vapi configurations which orchestrate a transcriber, model and voice API to simulate speech-to-speech, OpenAI’s Realtime API natively processes audio in and audio out.
In this guide, you’ll learn to:
The gpt-realtime-2025-08-28 model is production-ready.
OpenAI offers three realtime models, each with different capabilities and cost/performance trade-offs:
Realtime models support a specific set of OpenAI voices optimized for speech-to-speech:
Available across all realtime models:
alloy - Neutral and balancedecho - Warm and engagingshimmer - Energetic and expressiveOnly available with realtime models:
marin - Professional and clearcedar - Natural and conversationalThe following voices are NOT supported by realtime models: ash, ballad, coral, fable, onyx, and nova.
Configure a realtime assistant with function calling:
To use the enhanced voices only available with realtime models:
Unlike traditional OpenAI models, realtime models receive instructions through the session configuration. Vapi automatically converts your system messages to session instructions during WebSocket initialization.
The system message in your model configuration is automatically optimized for realtime processing:
Realtime models benefit from different prompting techniques than text-based models. These guidelines are based on OpenAI’s official prompting guide.
Organize your prompts with clear sections for better model comprehension:
Control the model’s speaking pace with explicit instructions:
Transitioning from standard STT/TTS to realtime models:
Ensure your selected voice is supported (alloy, echo, shimmer, marin, or cedar)
Best for production workloads requiring:
Best for development and testing:
Best for cost-sensitive applications:
Handle edge cases gracefully:
Be aware of these limitations when implementing realtime models:
Now that you understand OpenAI Realtime models: