Changelog
New timeoutSeconds Property in Custom LLM Model
- New timeoutSeconds Property in Custom LLM Model: Developers can now specify a custom timeout duration (between 20 and 600 seconds) for connections to their custom language model provider using the new timeoutSeconds property. This enhancement allows for better control over response waiting times, accommodating longer operations or varying network conditions.
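A client-side guard can catch out-of-range values before a request is sent. The following is a minimal sketch, assuming the model configuration is assembled as a Python dict. The helper name with_timeout is hypothetical; only timeoutSeconds and its 20 to 600 second range come from the entry above.

```python
MIN_TIMEOUT_SECONDS = 20   # lower bound stated in the changelog entry
MAX_TIMEOUT_SECONDS = 600  # upper bound stated in the changelog entry


def with_timeout(model_config: dict, timeout_seconds: int) -> dict:
    """Return a copy of model_config with a validated timeoutSeconds value.

    model_config stands in for a custom LLM model configuration. Its other
    keys are not specified by the changelog and are passed through untouched.
    """
    if not MIN_TIMEOUT_SECONDS <= timeout_seconds <= MAX_TIMEOUT_SECONDS:
        raise ValueError(
            f"timeoutSeconds must be between {MIN_TIMEOUT_SECONDS} and "
            f"{MAX_TIMEOUT_SECONDS}, got {timeout_seconds}"
        )
    return {**model_config, "timeoutSeconds": timeout_seconds}
```

Returning a copy rather than mutating the input keeps a shared base configuration reusable across assistants with different timeouts.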
Enhancements in Assistant Responses, New Gemini Model, and Call Handling
- Introduction of ‘gemini-2.0-flash-lite’ Model Option: You can now use gemini-2.0-flash-lite in Assistant.model[provider="google"].model[model="gemini-2.0-flash-lite"] for a reduced-latency, lower-cost Gemini model with a 1 million token context window.

- New Assistant Paginated Response: All Assistant endpoints now return paginated responses. Each response specifies itemsPerPage, totalItems, and currentPage, which you can use to navigate through a list of assistants.
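The three pagination fields are enough to walk every page. The sketch below is illustrative only: fetch_page is a hypothetical callable that performs the request, the "results" key and the flat placement of the pagination fields are assumptions, and pages are assumed to be numbered from 1 with a positive itemsPerPage.

```python
import math
from typing import Callable, Iterator


def iter_assistants(fetch_page: Callable[[int], dict]) -> Iterator[dict]:
    """Yield every assistant by walking pages until totalItems is covered.

    fetch_page(page_number) is a hypothetical callable returning a dict with
    "results" (assumed key) plus the itemsPerPage, totalItems, and
    currentPage fields named in the changelog.
    """
    page = 1  # assumption: the first page is numbered 1
    while True:
        response = fetch_page(page)
        yield from response["results"]
        total_pages = math.ceil(response["totalItems"] / response["itemsPerPage"])
        if response["currentPage"] >= total_pages:
            break
        page = response["currentPage"] + 1
```

Deriving the page count from totalItems and itemsPerPage on each response means the loop still terminates correctly if the list grows or shrinks while it is being read.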
Blocks Schema Deprecations, Scheduling Enhancements, and New Voice Options for Vapi Voice
- ‘scheduled’ Status Added to Calls and Messages: You can now set the status of a call or message to scheduled, allowing it to be executed at a future time. This enables scheduling functionality within your application for calls and messages.
- New Voice Options for Text-to-Speech: Four new voices (Neha, Cole, Harry, and Paige) have been added for text-to-speech services. You can enhance user experience by setting the voiceId to one of these options in your configurations.
- Removal of Step and Block Schemas: Blocks and Steps are now officially deprecated. Developers should update their applications to adapt to these changes, possibly by using new or alternative schemas provided.
New Workflows API, Telnyx Phone Number Support, Voice Options, and much more
- Workflows Replace Blocks: The API has migrated from blocks to workflows with new /workflow endpoints (see Introduction to Workflows). You can now use UpdateWorkflowDTO, where conversation components (Say, Gather, ApiRequest, Hangup, Transfer nodes) are explicitly connected via edges to create directed conversation flows.
- Telnyx Phone Number Support: Telnyx is now available as a phone number provider alongside Twilio and Vonage.
  - Use the TelnyxPhoneNumber, CreateTelnyxPhoneNumberDTO, and UpdateTelnyxPhoneNumberDTO schemas with /phone-number endpoints to create and update Telnyx phone numbers.
  - The Call.phoneCallProviderId now includes Telnyx’s callControlId alongside Twilio’s callSid and Vonage’s conversationUuid.
- New Voice Options:
  - Vapi Voices: New Vapi voices Elliot, Rohan, Lily, Savannah, and Hana
  - Hume Voice: New provider with octave model and customizable voice settings
  - Neuphonic Voice: New provider with neu_hq (higher quality) and neu_fast (faster) models
- New Cerebras Model: CerebrasModel supports llama3.1-8b and llama-3.3-70b models
- Enhanced Transcription:
  - New Providers: ElevenLabs and Speechmatics transcribers now available.
  - DeepgramTranscriber Numerals: New numerals option converts spoken numbers to digits (e.g., “nine-seven-two” → “972”)
- Improved Voicemail Detection: You can now use multiple provider implementations for assistant.voicemailDetection (Google, OpenAI, Twilio). The OpenAI implementation allows configuring detection duration (5-60 seconds, default: 15).
- Smart Endpointing Upgrade: Now supports LiveKit as an alternative to Vapi’s custom-trained model in StartSpeakingPlan.smartEndpointingEnabled. LiveKit only supports English but may offer different endpointing characteristics.
- Observability with Langfuse: New assistant.observabilityPlan property allows integration with Langfuse for tracing and monitoring of assistant calls. Configure with LangfuseObservabilityPlan.
- More Credential Support: Added support for Cerebras, Google, Hume, InflectionAI, Mistral, Trieve, and Neuphonic credentials in assistant.credentials
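Returning to the Workflows entry in this release, the idea of nodes explicitly connected by edges can be sketched as a plain data structure. This is an illustration rather than the actual UpdateWorkflowDTO shape: the "name", "type", "from", and "to" keys, the lowercase type values, and the dangling_edges helper are all assumptions.

```python
# Hypothetical workflow: three conversation nodes joined into a directed flow.
workflow = {
    "name": "appointment-reminder",
    "nodes": [
        {"name": "greet", "type": "say"},
        {"name": "ask_confirm", "type": "gather"},
        {"name": "goodbye", "type": "hangup"},
    ],
    "edges": [
        {"from": "greet", "to": "ask_confirm"},
        {"from": "ask_confirm", "to": "goodbye"},
    ],
}


def dangling_edges(wf: dict) -> list:
    """Return edges whose endpoints do not match any node name."""
    names = {node["name"] for node in wf["nodes"]}
    return [
        edge for edge in wf["edges"]
        if edge["from"] not in names or edge["to"] not in names
    ]
```

Checking for dangling edges before sending an update is one cheap way to catch a renamed or deleted node that an edge still points to.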
Enhanced Voicemail Detection, File Processing, Knowledge Base Integration, and Invoicing Updates
- Track Voicemail Detection Cost, Configure Google and Twilio Voicemail Detection Plans
  - You can now configure provider-specific settings and track voicemail detection costs through the new VoicemailDetectionCost schema at call.costs[type=voicemail-detection].
  - Configure Google or Twilio voicemail detection settings using the new GoogleVoicemailDetectionPlan and TwilioVoicemailDetectionPlan schemas.
- Improved File Processing Statuses and Parsed Text Content
  - File processing statuses have been renamed to better reflect their purpose: processing → done → failed.
  - Two new properties have been added to the File schema: parsedTextUrl and parsedTextBytes, providing direct access to parsed text content from processed files.
- Google Gemini Models for Knowledge Base Integration
  - The KnowledgeBase schema now fully supports Google’s Gemini models with specific model options.
  - You can use Gemini models in your knowledge bases at assistant.model.tools[type=query].knowledgeBases.
- New Invoicing Features
  - You can now use the InvoicePlan schema for customizing invoice information with company details.
  - This can be accessed via the new invoicePlan property on the Subscription schema.
  - Customize company name, email, tax ID, and address for your invoices.
- Additional Voice Options
  - Five new voice options have been added to the FallbackVapiVoice schema: Adi, Julia, Maibri (Web), Maibri (Phone), and Ashley.
  - Configure these voices in your assistant fallback plans at assistant.voice.fallbackPlan.voices.
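For the voicemail detection cost entry in this release, the path call.costs[type=voicemail-detection] amounts to filtering the call's cost entries by type. The sketch below assumes a call object already loaded as a Python dict; the numeric "cost" key and both helper names are assumptions, while "costs", "type", and the "voicemail-detection" value come from the path above.

```python
def voicemail_detection_costs(call: dict) -> list:
    """Select the entries matching call.costs[type=voicemail-detection]."""
    return [
        entry for entry in call.get("costs", [])
        if entry.get("type") == "voicemail-detection"
    ]


def total_voicemail_detection_cost(call: dict) -> float:
    """Sum the assumed numeric "cost" field over the matching entries."""
    return sum(entry.get("cost", 0) for entry in voicemail_detection_costs(call))
```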

New Query Tool and Vapi Voice Provider, Updates to Language Support and Error Handling
- New Query Tool Feature and Knowledge Base Integration
  - The API now supports a new query tool that allows assistants to search through knowledge bases. Add this tool to any assistant model by configuring it at the assistant.model.tools[type=query] path.
  - You can now link knowledge bases to query tools, providing structured information sources for assistants to access. Define knowledge bases with a name, model, provider, description, and associated file IDs.
- New Voice Provider Support
  A new voice provider “vapi” has been added with support for a voice called “Jordan” in FallbackVapiVoice. Configure it in your assistant fallback plans at assistant.voice.fallbackPlan.voices.

- Language Support Updates
  Myanmar language (“my”) has been added to supported languages, while “jp” and “mymr” codes have been removed. Use “ja” for Japanese language and “my” for Myanmar. Reference GladiaTranscriber for more language codes.
- Error Handling Improvements
  Added new error code pipeline-error-11labs-transcriber-failed for ServerMessageStatusUpdate.endedReason and ServerMessageEndOfCallReport.endedReason. Also added an explicit failed status for test suite runs in TestSuiteRun. These additions provide more detailed error reporting.
- Azure OpenAI Model Update
  The model gpt-4o-2024-08-06-ptu has been removed from Azure OpenAI credential schemas. Update any credential configurations that were using this model.
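For the query tool entry in this release, a configuration might be assembled along the following lines. This is a sketch under stated assumptions, not the QueryTool schema itself: the camelCase key names (in particular fileIds), the placeholder model name and file ID, and the build_query_tool helper are hypothetical. The tool type "query", the knowledgeBases list, and the five knowledge base attributes come from the entry.

```python
# Attributes the changelog says a knowledge base is defined with.
# The exact key spellings are assumptions.
REQUIRED_KB_KEYS = ("name", "model", "provider", "description", "fileIds")


def build_query_tool(knowledge_bases: list) -> dict:
    """Build a tool dict for assistant.model.tools[type=query]."""
    for kb in knowledge_bases:
        missing = [key for key in REQUIRED_KB_KEYS if key not in kb]
        if missing:
            raise ValueError(f"knowledge base is missing keys: {missing}")
    return {"type": "query", "knowledgeBases": knowledge_bases}


product_docs = {
    "name": "product-docs",
    "model": "gemini-1.5-flash",      # placeholder model name
    "provider": "google",
    "description": "Product manuals and FAQs",
    "fileIds": ["file-id-placeholder"],  # placeholder file ID
}

assistant = {"model": {"tools": [build_query_tool([product_docs])]}}
```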
Claude 3.7 Sonnet and GPT 4.5 preview, New Hume AI Voice Provider, New Supabase Storage Provider, Enhanced Call Transfer Options
- Claude 3.7 Sonnet with Thinking Configuration Support: You can now use the latest claude-3-7-sonnet-20250219 model with a new “thinking” feature via the AnthropicThinkingConfig schema. Configure it in assistant.model or call.squad.members.assistant.model.
- OpenAI GPT-4.5-Preview Support: You can now use the latest gpt-4.5-preview model as a primary model or fallback option via the OpenAIModel schema. Configure it in assistant.model or call.squad.members.assistant.model.
- New Hume Voice Provider: Integrated Hume AI as a new voice provider with the “octave” model for text-to-speech synthesis.

- Supabase Storage Integration: New Supabase S3-compatible storage support for file operations. This integration lets developers configure buckets and paths across 16 regions, enabling structured file storage with proper authentication. Configure SupabaseBucketPlan in assistant.credentials.bucketPlan or call.squad.members.assistant.credentials.bucketPlan.
- Voice Speed Control: Added a speed parameter to ElevenLabs voices (ElevenLabsVoice), ranging from 0.7 (slower) to 1.2 (faster). This enhancement gives developers more control over speech cadence for more natural-sounding conversations.
- Enhanced Call Transfer Options in TransferPlan: Added a new dial option to the sipVerb parameter for call transfers. This complements the existing refer (default) and bye options, providing more flexibility in call handling.
  - ‘dial’: Uses SIP DIAL to transfer the call
- Zero-Value Minimum Subscription Minutes: Changed the minimum value for minutesUsed and minutesIncluded from 1 to 0. This supports tracking of new subscriptions and free tiers with no included minutes.
- Zero-Value Minimum KeypadInputPlan Timeout: Adjusted the KeypadInputPlan.timeoutSeconds minimum from 0.5 to 0.
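For the TransferPlan entry in this release, the three sipVerb values and the refer default can be captured in a small builder. This is a sketch only: the transfer_plan helper is hypothetical and any other transfer plan fields are passed through unchanged, since the entry does not describe them.

```python
SIP_VERBS = ("refer", "bye", "dial")  # options named in the changelog entry
DEFAULT_SIP_VERB = "refer"            # stated default


def transfer_plan(sip_verb: str = DEFAULT_SIP_VERB, **other_fields) -> dict:
    """Return a transfer plan dict with a validated sipVerb value."""
    if sip_verb not in SIP_VERBS:
        raise ValueError(f"sipVerb must be one of {SIP_VERBS}, got {sip_verb!r}")
    return {**other_fields, "sipVerb": sip_verb}
```

Calling transfer_plan() with no arguments yields the refer behavior, so existing callers are unaffected, while transfer_plan("dial") opts in to the new verb.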
Phone Keypad Input Support, OAuth2 and Analytics Improvements
- Keypad Input Support for Phone Calls: A new keypadInputPlan feature has been added to enable handling of DTMF (touch-tone) keypad inputs during phone calls. This allows your voice assistant to collect numeric input from callers, like account numbers, menu selections, or confirmation codes.
  The feature can be configured in:
  - assistant.keypadInputPlan
  - call.squad.members.assistant.keypadInputPlan
  - call.squad.members.assistantOverrides.keypadInputPlan
- OAuth2 Authentication Enhancement: The OAuth2AuthenticationPlan now includes a scope property to specify access scopes when authenticating. This allows more granular control over permissions when integrating with OAuth2-based services.
  The scope property can be configured at:
  - assistant.credentials.authenticationPlan
  - call.squad.members.assistant.credentials.authenticationPlan
- New Analytics Metric: Minutes Used. The AnalyticsOperation schema now includes a new column option: minutesUsed. This metric allows you to track and analyze the duration of calls in your usage reports and analytics dashboards.
- Removed TrieveKnowledgeBaseCreate Schema: Removed the TrieveKnowledgeBaseCreate schema from:
  - TrieveKnowledgeBase.createPlan
  - CreateTrieveKnowledgeBaseDTO.createPlan
  - UpdateTrieveKnowledgeBaseDTO.createPlan
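For the keypad input entry in this release, a plan could be built and attached at assistant.keypadInputPlan roughly as follows. This is a sketch under assumptions: the "enabled" field and the keypad_input_plan helper are hypothetical, while timeoutSeconds and its minimum of 0 come from the KeypadInputPlan timeout entry elsewhere in this changelog. Any upper bound on the timeout is not stated in the source and is not checked here.

```python
def keypad_input_plan(timeout_seconds: float, enabled: bool = True) -> dict:
    """Build a keypadInputPlan dict, rejecting negative timeouts."""
    if timeout_seconds < 0:
        raise ValueError("timeoutSeconds must be 0 or greater")
    return {
        "enabled": enabled,                 # assumed field
        "timeoutSeconds": timeout_seconds,  # field named in the changelog
    }


# Attach the plan at assistant.keypadInputPlan, one of the listed locations.
assistant = {"keypadInputPlan": keypad_input_plan(2)}
```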
Test Suite APIs, Enhanced Call Transfers, Voice Model Enhancements
- Introducing Test Suite Management APIs: You can now test your assistant conversations before deploying them by creating end-to-end tests, adding test cases, and running and reviewing test suites. You can configure these tests through the Test Suites dashboard page and Test Suite APIs, and learn more in the docs.

- Enhanced Call Transfers with TwiML Control: You can now use twiml (Twilio Markup Language) in Assistant.model.tools[type=transferCall].destinations[].transferPlan[mode=warm-transfer-twiml] to execute TwiML instructions before connecting the call, allowing for pre-transfer announcements or data collection with Twilio.
- New Voice Models and Experimental Controls:
  - mistv2 Rime AI Voice: You can now use the mistv2 model in Assistant.voice[provider="rime-ai"].model[model="mistv2"].
  - OpenAI Models: You can now use the chatgpt-4o-latest model in Assistant.model[provider="openai"].model[model="chatgpt-4o-latest"].
- Experimental Controls for Cartesia Voices: You can now specify your Cartesia voice speed (string) and emotional range (array) with Assistant.voice[provider="cartesia"].experimentalControls.
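As an illustration of the Cartesia entry, a voice configuration with experimental controls might look like the sketch below. The "speed" and "emotion" key names, the example values, the placeholder voiceId, and the check_experimental_controls helper are assumptions; only the provider name, the experimentalControls property, and the string and array types come from the entry.

```python
voice = {
    "provider": "cartesia",
    "voiceId": "your-cartesia-voice-id",  # placeholder
    "experimentalControls": {
        "speed": "slow",                 # a string, per the changelog
        "emotion": ["positivity:high"],  # an array; key and value format assumed
    },
}


def check_experimental_controls(controls: dict) -> bool:
    """Check the types the changelog states: speed is a string, emotion an array."""
    speed_ok = isinstance(controls.get("speed"), str)
    emotion = controls.get("emotion")
    emotion_ok = isinstance(emotion, list) and all(isinstance(e, str) for e in emotion)
    return speed_ok and emotion_ok
```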
What’s New
- Configure 16 text normalization processors in FormatPlan: You can now control how text is transcribed and spoken for currency, dates, etc. by setting the formattersEnabled array in Assistant.voice.chunkPlan.formatPlan (not specifying formattersEnabled defaults to all formatters being enabled). See all available formatters in the FormatPlan.formattersEnabled reference.
- Deepgram Keyterm Prompting: The keyterm array in DeepgramTranscriber implements Deepgram’s Keyterm Prompting technology, boosting recall for domain-specific terminology. Compared to the existing keywords field, you should reserve keyterm for compliance-sensitive terms like medical codes while using keywords for proper nouns / brand names.
- Subscription usage tracking improvements: The minutesUsedNextResetAt timestamp now appears in all subscription tiers (not just enterprise), exposed at subscription.minutesUsedNextResetAt for predictable billing cycle integration. Combine with the existing minutesUsed and minutesIncluded metrics to build custom usage dashboards, regardless of subscription tier.
- Neuphonic voice synthesis: You can now configure Neuphonic as a voice provider with Assistant.voice[provider="neuphonic"]. Handle appropriate errors with pipeline-error-neuphonic-voice-failed. Test latency thresholds as Neuphonic requires 200ms additional processing time compared to ElevenLabs.

- Support for pre-transfer announcements in ClientInboundMessageTransfer: The content field in ClientInboundMessageTransfer now supports pre-transfer announcements (“Connecting you to billing…”) before SIP/number routing. Implement via WebSocket messages using type: “transfer” with destination object.
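A transfer message with an announcement could be serialized as in the sketch below. The shape of the destination object (a "type" of "number" plus a "number" field), the example phone number, and the transfer_message helper are assumptions; the type value "transfer", the destination object, and the content field come from the entry above. Sending the string over an open WebSocket is left to whichever client library is in use.

```python
import json
from typing import Optional


def transfer_message(destination: dict, content: Optional[str] = None) -> str:
    """Serialize a transfer message, adding content only when provided."""
    message = {"type": "transfer", "destination": destination}
    if content is not None:
        message["content"] = content  # spoken before the call is routed
    return json.dumps(message)


payload = transfer_message(
    {"type": "number", "number": "+15555550123"},  # assumed destination shape
    content="Connecting you to billing...",
)
```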