Changelog

Get the (almost) daily changelog

Enhancements in Assistant Responses, New Gemini Model, and Call Handling

  1. Introduction of ‘gemini-2.0-flash-lite’ Model Option: You can now use gemini-2.0-flash-lite in Assistant.model[provider="google"].model[model="gemini-2.0-flash-lite"] for a reduced latency, lower cost Gemini model with a 1 million token context window.
gemini-2.0-flash-lite Model Option
  2. New Assistant Paginated Response: All Assistant endpoints now return paginated responses. Each response specifies itemsPerPage, totalItems, and currentPage, which you can use to navigate through a list of assistants.
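
As a rough illustration of how the three pagination fields fit together, the sketch below derives the page count and the next page number from one response's metadata. It assumes that currentPage is 1-based and that the metadata arrives as a plain dictionary; neither detail is stated in the changelog, and the helper names are hypothetical.

```python
import math
from typing import Optional


def total_pages(metadata: dict) -> int:
    """Number of pages implied by itemsPerPage and totalItems."""
    per_page = metadata["itemsPerPage"]
    if per_page <= 0:
        raise ValueError("itemsPerPage must be positive")
    return math.ceil(metadata["totalItems"] / per_page)


def next_page(metadata: dict) -> Optional[int]:
    """Next page number, or None on the last page (assumes 1-based pages)."""
    current = metadata["currentPage"]
    return current + 1 if current < total_pages(metadata) else None


example = {"itemsPerPage": 25, "totalItems": 60, "currentPage": 2}
print(total_pages(example))  # 3
print(next_page(example))    # 3
```

A client could keep requesting pages until next_page returns None.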

Blocks Schema Deprecations, Scheduling Enhancements, and New Voice Options for Vapi Voice

  1. ‘scheduled’ Status Added to Calls and Messages: You can now set the status of a call or message to scheduled, allowing it to be executed at a future time. This enables scheduling functionality within your application for calls and messages.

  2. New Voice Options for Text-to-Speech: Four new voices—Neha, Cole, Harry, and Paige—have been added for text-to-speech services. You can enhance user experience by setting the voiceId to one of these options in your configurations.

  3. Removal of Step and Block Schemas:

    Blocks and Steps are now officially deprecated. Developers should update their applications to adapt to these changes, possibly by using new or alternative schemas provided.

New Workflows API, Telnyx Phone Number Support, Voice Options, and much more

  1. Workflows Replace Blocks: The API has migrated from blocks to workflows with new /workflow endpoints. You can now use UpdateWorkflowDTO, where conversation components (Say, Gather, ApiRequest, Hangup, and Transfer nodes) are explicitly connected via edges to create directed conversation flows.
{
  "name": "Customer Support Workflow",
  "nodes": [
    {
      "id": "greeting",
      "type": "Say",
      "text": "Hello, welcome to customer support. Do you need help with billing or technical issues?"
    },
    {
      "id": "menu",
      "type": "Gather",
      "options": ["billing", "technical", "other"]
    },
    {
      "id": "billing",
      "type": "Say",
      "text": "I'll connect you with our billing department."
    },
    {
      "id": "technical",
      "type": "Say",
      "text": "I'll connect you with our technical support team."
    },
    {
      "id": "transfer_billing",
      "type": "Transfer",
      "destination": {
        "type": "number",
        "number": "+1234567890"
      }
    },
    {
      "id": "transfer_technical",
      "type": "Transfer",
      "destination": {
        "type": "number",
        "number": "+1987654321"
      }
    }
  ],
  "edges": [
    {
      "from": "greeting",
      "to": "menu"
    },
    {
      "from": "menu",
      "to": "billing",
      "condition": {
        "type": "logic",
        "liquid": "{% if input == 'billing' %} true {% endif %}"
      }
    },
    {
      "from": "menu",
      "to": "technical",
      "condition": {
        "type": "logic",
        "liquid": "{% if input == 'technical' %} true {% endif %}"
      }
    },
    {
      "from": "billing",
      "to": "transfer_billing"
    },
    {
      "from": "technical",
      "to": "transfer_technical"
    }
  ]
}
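
Because every edge in a workflow refers to nodes by id, a quick local check before submitting a workflow can catch edges that point at nodes that do not exist. The sketch below is a minimal client-side check, assuming only the nodes/edges shape shown in the example above; the function name is hypothetical and this is not part of the Vapi API.

```python
from typing import List, Tuple


def find_dangling_edges(workflow: dict) -> List[Tuple[str, str]]:
    """Return (from, to) pairs whose endpoints are not declared in "nodes"."""
    node_ids = {node["id"] for node in workflow.get("nodes", [])}
    return [
        (edge["from"], edge["to"])
        for edge in workflow.get("edges", [])
        if edge["from"] not in node_ids or edge["to"] not in node_ids
    ]


workflow = {
    "nodes": [
        {"id": "greeting", "type": "Say"},
        {"id": "menu", "type": "Gather"},
    ],
    "edges": [
        {"from": "greeting", "to": "menu"},
        {"from": "menu", "to": "billing"},  # "billing" is not declared above
    ],
}
print(find_dangling_edges(workflow))  # [('menu', 'billing')]
```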
  2. Telnyx Phone Number Support: Telnyx is now available as a phone number provider alongside Twilio and Vonage.

  3. New Voice Options:

    • Vapi Voices: New Vapi voices Elliot, Rohan, Lily, Savannah, and Hana
    • Hume Voice: New provider with octave model and customizable voice settings
    • Neuphonic Voice: New provider with neu_hq (higher quality) and neu_fast (faster) models
  4. New Cerebras Model: CerebrasModel supports the llama3.1-8b and llama-3.3-70b models.

  5. Enhanced Transcription:

    • New Providers: ElevenLabs and Speechmatics transcribers are now available.
    • DeepgramTranscriber Numerals: New numerals option converts spoken numbers to digits (e.g., “nine-seven-two” → “972”)
  6. Improved Voicemail Detection: You can now use multiple provider implementations for assistant.voicemailDetection (Google, OpenAI, Twilio). The OpenAI implementation allows configuring the detection duration (5-60 seconds, default: 15).

  7. Smart Endpointing Upgrade: StartSpeakingPlan.smartEndpointingEnabled now supports LiveKit as an alternative to Vapi’s custom-trained model. LiveKit only supports English but may offer different endpointing characteristics.

  8. Observability with Langfuse: The new assistant.observabilityPlan property allows integration with Langfuse for tracing and monitoring of assistant calls. Configure it with LangfuseObservabilityPlan.

  9. More Credential Support: Added support for Cerebras, Google, Hume, InflectionAI, Mistral, Trieve, and Neuphonic credentials in assistant.credentials.

Enhanced Voicemail Detection, File Processing, Knowledge Base Integration, and Invoicing Updates

  1. Track Voicemail Detection Cost, Configure Google and Twilio Voicemail Detection Plans
  • You can now configure provider-specific settings and track voicemail detection costs through the new VoicemailDetectionCost schema at call.costs[type=voicemail-detection].
  • Configure Google or Twilio voicemail detection settings using the new GoogleVoicemailDetectionPlan and TwilioVoicemailDetectionPlan schemas.
// Google configuration example
{
  "provider": "google",
  "voicemailExpectedDurationSeconds": 15 // Range: 5-60 seconds
}

// Twilio configuration example
{
  "provider": "twilio",
  "enabled": true,
  "machineDetectionTimeout": 30, // Range: 3-59 seconds
  "voicemailDetectionTypes": ["machine_end_beep", "machine_end_silence"]
}
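
The ranges noted in the comments above can be checked locally before a plan is sent. The following sketch covers only the two providers and the two numeric fields shown in these examples; the function name is hypothetical, and the real API may enforce additional rules not listed here.

```python
def validate_voicemail_plan(plan: dict) -> list:
    """Collect range errors for the Google and Twilio fields shown above."""
    errors = []
    provider = plan.get("provider")
    if provider == "google":
        seconds = plan.get("voicemailExpectedDurationSeconds")
        if seconds is not None and not 5 <= seconds <= 60:
            errors.append("voicemailExpectedDurationSeconds must be within 5-60")
    elif provider == "twilio":
        timeout = plan.get("machineDetectionTimeout")
        if timeout is not None and not 3 <= timeout <= 59:
            errors.append("machineDetectionTimeout must be within 3-59")
    else:
        # Other providers are outside the scope of this sketch.
        errors.append(f"provider not covered by this sketch: {provider!r}")
    return errors


print(validate_voicemail_plan({"provider": "google", "voicemailExpectedDurationSeconds": 15}))  # []
print(validate_voicemail_plan({"provider": "twilio", "machineDetectionTimeout": 90}))
```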
  2. Improved File Processing Statuses and Parsed Text Content
  • File processing statuses have been renamed to better reflect their purpose: processing, done, and failed.
  • Two new properties have been added to the File schema: parsedTextUrl and parsedTextBytes, providing direct access to parsed text content from processed files.
  3. Google Gemini Models for Knowledge Base Integration
  • The KnowledgeBase schema now fully supports Google’s Gemini models with specific model options.
  • You can use Gemini models in your knowledge bases at assistant.model.tools[type=query].knowledgeBases.
"model": {
  "enum": [
    "gemini-2.0-flash-thinking-exp",
    "gemini-2.0-pro-exp-02-05",
    "gemini-2.0-flash",
    "gemini-2.0-flash-lite-preview-02-05",
    "gemini-2.0-flash-exp",
    "gemini-2.0-flash-realtime-exp",
    "gemini-1.5-flash",
    "gemini-1.5-flash-002",
    "gemini-1.5-pro",
    "gemini-1.5-pro-002",
    "gemini-1.0-pro"
  ]
}
  4. New Invoicing Features
  • You can now use the InvoicePlan schema to customize invoice information with company details.
  • This can be accessed via the new invoicePlan property on the Subscription schema.
  • Customize company name, email, tax ID, and address for your invoices.
  5. Additional Voice Options
  • Five new voice options have been added to the FallbackVapiVoice schema: Adi, Julia, Maibri (Web), Maibri (Phone), and Ashley.
  • Configure these voices in your assistant fallback plans at assistant.voice.fallbackPlan.voices.
Additional Vapi Voices

New Query Tool and Vapi Voice Provider, Updates to Language Support and Error Handling

  1. New Query Tool Feature and Knowledge Base Integration
  • The API now supports a new query tool that allows assistants to search through knowledge bases. Add this tool to any assistant model by configuring it at the assistant.model.tools[type=query] path.
  • You can now link knowledge bases to query tools, providing structured information sources for assistants to access. Define knowledge bases with a name, model, provider, description, and associated file IDs.
{
  "type": "query",
  "async": false,
  "server": {
    "url": "https://api.example.com/query-handler"
  },
  "function": {
    "name": "query_knowledge",
    "description": "Query knowledge bases for information",
    "parameters": {
      "type": "object",
      "properties": {
        "query": {
          "type": "string",
          "description": "The query to search for"
        }
      },
      "required": ["query"]
    }
  },
  "knowledgeBases": [
    {
      "name": "Product Documentation",
      "model": "gemini-1.5-flash",
      "provider": "google",
      "description": "Contains all product manuals",
      "fileIds": ["file-123", "file-456"]
    }
  ]
}
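
Each knowledge base entry in the example above carries a name, model, provider, description, and file IDs. The sketch below lists which of those keys are absent from an entry. Treating all five as mandatory is an assumption made for illustration, since the changelog does not say which fields the schema actually requires, and the helper name is hypothetical.

```python
# Assumption: all five fields named in the changelog entry are treated as required.
EXPECTED_KB_FIELDS = ("name", "model", "provider", "description", "fileIds")


def missing_kb_fields(knowledge_base: dict) -> list:
    """Return the expected keys that are absent from a knowledge base entry."""
    return [field for field in EXPECTED_KB_FIELDS if field not in knowledge_base]


entry = {
    "name": "Product Documentation",
    "model": "gemini-1.5-flash",
    "provider": "google",
    "fileIds": ["file-123", "file-456"],
}
print(missing_kb_fields(entry))  # ['description']
```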
  2. New Voice Provider Support

A new voice provider “vapi” has been added with support for a voice called “Jordan” in FallbackVapiVoice. Configure it in your assistant fallback plans at assistant.voice.fallbackPlan.voices.

Vapi Voice Provider
  3. Language Support Updates

Myanmar language (“my”) has been added to supported languages, while “jp” and “mymr” codes have been removed. Use “ja” for Japanese language and “my” for Myanmar. Reference GladiaTranscriber for more language codes.

  4. Error Handling Improvements

Added new error code pipeline-error-11labs-transcriber-failed for ServerMessageStatusUpdate.endedReason and ServerMessageEndOfCallReport.endedReason. Also added an explicit failed status for test suite runs in TestSuiteRun. These additions provide more detailed error reporting.

  5. Azure OpenAI Model Update

The model gpt-4o-2024-08-06-ptu has been removed from Azure OpenAI credential schemas. Update any credential configurations that were using this model.
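
For the language code change in this release, stored configurations that still use the removed codes need updating. A minimal migration sketch follows. Mapping “jp” to “ja” comes directly from the entry above; mapping “mymr” to “my” is an inference from the same entry and should be confirmed against your own data. The names are hypothetical.

```python
# "jp" -> "ja" is stated in the changelog; "mymr" -> "my" is an assumption.
REPLACED_LANGUAGE_CODES = {"jp": "ja", "mymr": "my"}


def migrate_language_code(code: str) -> str:
    """Map a removed language code to its replacement; leave others unchanged."""
    return REPLACED_LANGUAGE_CODES.get(code, code)


print([migrate_language_code(c) for c in ["jp", "mymr", "en"]])  # ['ja', 'my', 'en']
```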

Claude 3.7 Sonnet and GPT 4.5 preview, New Hume AI Voice Provider, New Supabase Storage Provider, Enhanced Call Transfer Options

  1. Claude 3.7 Sonnet with Thinking Configuration Support: You can now use the latest claude-3-7-sonnet-20250219 model with a new “thinking” feature via the AnthropicThinkingConfig schema. Configure it in assistant.model or call.squad.members.assistant.model:
{
  "model": "claude-3-7-sonnet-20250219",
  "provider": "anthropic",
  "thinking": {
    "type": "enabled",
    "budgetTokens": 5000 // min 1024, max 100000
  }
}
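
If the thinking budget is set from user input or environment configuration, a small builder can reject out-of-range values before the request is made. This sketch reuses the field names and the 1024 to 100000 bounds from the example above; the builder function itself is hypothetical.

```python
MIN_BUDGET_TOKENS = 1024
MAX_BUDGET_TOKENS = 100_000


def anthropic_thinking_model(budget_tokens: int) -> dict:
    """Build the model config shown above, rejecting out-of-range budgets."""
    if not MIN_BUDGET_TOKENS <= budget_tokens <= MAX_BUDGET_TOKENS:
        raise ValueError(
            f"budgetTokens must be within {MIN_BUDGET_TOKENS}-{MAX_BUDGET_TOKENS}"
        )
    return {
        "model": "claude-3-7-sonnet-20250219",
        "provider": "anthropic",
        "thinking": {"type": "enabled", "budgetTokens": budget_tokens},
    }


print(anthropic_thinking_model(5000)["thinking"])
```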
  2. OpenAI GPT-4.5-Preview Support: You can now use the latest gpt-4.5-preview model as a primary model or fallback option via the OpenAIModel schema. Configure it in assistant.model or call.squad.members.assistant.model:
{
  "model": "gpt-4.5-preview",
  "provider": "openai"
}
  3. New Hume Voice Provider: Integrated Hume AI as a new voice provider with the “octave” model for text-to-speech synthesis.
Hume Voice Provider
  4. Supabase Storage Integration: New Supabase S3-compatible storage support for file operations. This integration lets developers configure buckets and paths across 16 regions, enabling structured file storage with proper authentication. Configure SupabaseBucketPlan in assistant.credentials.bucketPlan or call.squad.members.assistant.credentials.bucketPlan.

  5. Voice Speed Control: ElevenLabsVoice now has a speed parameter ranging from 0.7 (slower) to 1.2 (faster). This gives developers more control over speech cadence for more natural-sounding conversations.

  6. Enhanced Call Transfer Options in TransferPlan: Added a new dial option to the sipVerb parameter for call transfers. This complements the existing refer (default) and bye options, providing more flexibility in call handling.

  • ‘dial’: Uses SIP DIAL to transfer the call
  7. Zero-Value Minimum Subscription Minutes: Changed the minimum value for minutesUsed and minutesIncluded from 1 to 0. This supports tracking of new subscriptions and free tiers with no included minutes.

  8. Zero-Value Minimum KeypadInputPlan Timeout: Adjusted the KeypadInputPlan.timeoutSeconds minimum from 0.5 to 0.

Phone Keypad Input Support, OAuth2 and Analytics Improvements

  1. Keypad Input Support for Phone Calls: A new keypadInputPlan feature has been added to enable handling of DTMF (touch-tone) keypad inputs during phone calls. This allows your voice assistant to collect numeric input from callers, like account numbers, menu selections, or confirmation codes.

Configuration options:

{
  "keypadInputPlan": {
    "enabled": true, // Default: false
    "delimiters": "#", // Options: "#", "*", or "" (empty string)
    "timeoutSeconds": 2 // Range: 0.5-10 seconds, Default: 2
  }
}

The feature can be configured in:

  • assistant.keypadInputPlan
  • call.squad.members.assistant.keypadInputPlan
  • call.squad.members.assistantOverrides.keypadInputPlan
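
To make the role of the delimiter concrete, the sketch below splits a stream of key presses into completed inputs each time the configured delimiter key appears. It only illustrates the general idea of delimiter-terminated DTMF input. It is not Vapi's implementation, it does not model timeoutSeconds, and the function name is hypothetical.

```python
from typing import List, Tuple


def split_keypad_input(key_presses: str, delimiters: str = "#") -> Tuple[List[str], str]:
    """Split key presses into completed inputs plus any unterminated remainder.

    With an empty delimiter string nothing is ever terminated here; in that
    case a real system would have to rely on a timeout instead.
    """
    completed: List[str] = []
    buffer = ""
    for key in key_presses:
        if key in delimiters:
            if buffer:
                completed.append(buffer)
            buffer = ""
        else:
            buffer += key
    return completed, buffer


print(split_keypad_input("1234#56", "#"))  # (['1234'], '56')
```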
  2. OAuth2 Authentication Enhancement: The OAuth2AuthenticationPlan now includes a scope property to specify access scopes when authenticating. This allows more granular control over permissions when integrating with OAuth2-based services.
{
  "credentials": [
    {
      "authenticationPlan": {
        "type": "oauth2",
        "url": "https://example.com/oauth2/token",
        "clientId": "your-client-id",
        "clientSecret": "your-client-secret",
        "scope": "read:data" // New property, max length: 1000 characters
      }
    }
  ]
}

The scope property can be configured at:

  • assistant.credentials.authenticationPlan
  • call.squad.members.assistant.credentials.authenticationPlan
  3. New Analytics Metric: Minutes Used. The AnalyticsOperation schema now includes a new column option: minutesUsed. This metric allows you to track and analyze the duration of calls in your usage reports and analytics dashboards.

  4. Removed TrieveKnowledgeBaseCreate Schema: The TrieveKnowledgeBaseCreate schema has been removed from:

  • TrieveKnowledgeBase.createPlan
  • CreateTrieveKnowledgeBaseDTO.createPlan
  • UpdateTrieveKnowledgeBaseDTO.createPlan

Test Suite APIs, Enhanced Call Transfers, Voice Model Enhancements

  1. Introducing Test Suite Management APIs: You can now test your assistant conversations before deploying them by creating end-to-end tests, adding test cases, and running and reviewing test suites. You can configure these tests through the Test Suites dashboard page and Test Suite APIs, and learn more in the docs.
Test Suite Management APIs
  2. Enhanced Call Transfers with TwiML Control: You can now use twiml (Twilio Markup Language) in Assistant.model.tools[type=transferCall].destinations[].transferPlan[mode=warm-transfer-twiml] to execute TwiML instructions before connecting the call, allowing for pre-transfer announcements or data collection with Twilio.

  3. New Voice Models and Experimental Controls:

  • Experimental Controls for Cartesia Voices: You can now specify your Cartesia voice speed (string) and emotional range (array) with Assistant.voice[provider="cartesia"].experimentalControls. For example:

{
  "speed": "fast",
  "emotion": [
    "anger:lowest",
    "curiosity:high"
  ]
}
Property   Options
speed      slowest, slow, normal (default), fast, fastest
emotion    anger:lowest, anger:low, anger:high, anger:highest,
           positivity:lowest, positivity:low, positivity:high, positivity:highest,
           surprise:lowest, surprise:low, surprise:high, surprise:highest,
           sadness:lowest, sadness:low, sadness:high, sadness:highest,
           curiosity:lowest, curiosity:low, curiosity:high, curiosity:highest
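
The option table above can be turned into a small local check for experimentalControls values. The sketch below assumes the listed speeds and emotion:level combinations are exhaustive, which the changelog does not state outright; the function name is hypothetical.

```python
SPEEDS = {"slowest", "slow", "normal", "fast", "fastest"}
EMOTIONS = {"anger", "positivity", "surprise", "sadness", "curiosity"}
LEVELS = {"lowest", "low", "high", "highest"}


def validate_experimental_controls(controls: dict) -> list:
    """Return a list of problems found in a Cartesia experimentalControls dict."""
    errors = []
    speed = controls.get("speed", "normal")  # "normal" is the documented default
    if speed not in SPEEDS:
        errors.append(f"unknown speed: {speed}")
    for entry in controls.get("emotion", []):
        name, _, level = entry.partition(":")
        if name not in EMOTIONS or level not in LEVELS:
            errors.append(f"unknown emotion: {entry}")
    return errors


print(validate_experimental_controls(
    {"speed": "fast", "emotion": ["anger:lowest", "curiosity:high"]}
))  # []
```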

What’s New

  1. Configure 16 text normalization processors in FormatPlan: You can now control how text is transcribed and spoken for currency, dates, etc. by setting the formattersEnabled array in Assistant.voice.chunkPlan.formatPlan (not specifying formattersEnabled defaults to all formatters being enabled). See all available formatters in the FormatPlan.formattersEnabled reference.

  2. Deepgram Keyterm Prompting: The keyterm array in DeepgramTranscriber implements Deepgram’s Keyterm Prompting technology, boosting recall for domain-specific terminology. Compared to the existing keywords field:

Feature        keywords             keyterm
Recall Boost   15-20%               Up to 90%
Format         Word:Weight          Raw phrases
Use Case       General vocabulary   Critical terms

You should reserve keyterm for compliance-sensitive terms like medical codes while using keywords for proper nouns / brand names.

  3. Subscription usage tracking improvements: The minutesUsedNextResetAt timestamp now appears in all subscription tiers (not just enterprise), exposed at subscription.minutesUsedNextResetAt for predictable billing cycle integration. Combine it with the existing minutesUsed and minutesIncluded metrics to build custom usage dashboards, regardless of subscription tier.

  4. Neuphonic voice synthesis: You can now configure Neuphonic as a voice provider with Assistant.voice[provider="neuphonic"]. Handle failures reported with the pipeline-error-neuphonic-voice-failed error code. Test your latency thresholds, as Neuphonic requires 200ms additional processing time compared to ElevenLabs.

Neuphonic Voice Synthesis
  5. Support for pre-transfer announcements in ClientInboundMessageTransfer: The content field in ClientInboundMessageTransfer now supports pre-transfer announcements (“Connecting you to billing…”) before SIP/number routing. Implement via WebSocket messages using type: “transfer” with destination object.
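
For the subscription usage fields mentioned in this list, a custom dashboard might combine the three values roughly as sketched below. The sketch assumes minutesUsedNextResetAt is an ISO 8601 string ending in "Z", which the changelog does not confirm; the function name and the output keys are hypothetical.

```python
from datetime import datetime, timezone


def usage_summary(subscription: dict, now: datetime) -> dict:
    """Summarize remaining minutes, overage, and whole days until the reset."""
    used = subscription["minutesUsed"]
    included = subscription["minutesIncluded"]
    # Assumption: the timestamp is ISO 8601 with a trailing "Z" for UTC.
    reset_at = datetime.fromisoformat(
        subscription["minutesUsedNextResetAt"].replace("Z", "+00:00")
    )
    return {
        "remaining": max(included - used, 0),
        "overage": max(used - included, 0),
        "days_until_reset": (reset_at - now).days,
    }


subscription = {
    "minutesUsed": 120,
    "minutesIncluded": 100,
    "minutesUsedNextResetAt": "2025-04-01T00:00:00Z",
}
now = datetime(2025, 3, 25, tzinfo=timezone.utc)
print(usage_summary(subscription, now))
# {'remaining': 0, 'overage': 20, 'days_until_reset': 7}
```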

Deprecation Notice

OrgWithOrgUser is now deprecated, which affects endpoints that return organization-user composites. It has been replaced with separate Org and User schemas for better clarity and consistency.