Get the (almost) daily changelog

Breaking Changes & API Cleanup

  1. Legacy Endpoint Removal: The following deprecated endpoints have been removed as part of our API modernization effort:

    • /logs - Use call artifacts and monitoring instead
    • /workflow/{id} - Access workflows through the main workflow endpoints
    • /test-suite and related paths - Replaced by the new evaluation system
    • /knowledge-base and related paths - Integrated into model configurations
  2. Knowledge Base Architecture Change: The knowledgeBaseId property has been removed from all model configurations. This affects:

  3. Transcriber Property Deprecation: AssemblyAITranscriber.wordFinalizationMaxWaitTime and FallbackAssemblyAITranscriber.wordFinalizationMaxWaitTime are now deprecated:

    • Use smart endpointing plans for better speech timing control
    • More precise conversation flow management
    • Enhanced end-of-turn detection capabilities
  4. Schema Path Cleanup: Removed numerous unused schema paths from model configurations to simplify the API structure and improve performance. This cleanup affects internal schema references but doesn’t impact your existing integrations.

  5. New v2 API: We are introducing a new API version v2. These changes are part of our ongoing effort to:

    • Simplify the API structure for better developer experience
    • Remove redundant and deprecated functionality
    • Complete the transition to new evaluation and compliance systems
    • Improve API performance and maintainability

For details on the new features that replace these deprecated endpoints, see our recent changelog entries:

If you’re currently using any of the removed endpoints or properties, you must migrate to the new alternatives before this release. Contact support if you need assistance with migration strategies.

Migration Guide

Logging & Monitoring

Replace /logs endpoint usage with call artifacts, monitoring plans, and end-of-call reports for comprehensive logging.

Testing Framework

Migrate from test-suite endpoints to the new evaluation system with mock conversations and comprehensive result tracking.

Knowledge Base

Update model configurations to use the integrated knowledge base system instead of separate knowledgeBaseId references.

Speech Timing

Replace deprecated transcriber timing properties with smart endpointing plans for better conversation flow control.

Removed Endpoints

The following endpoints are no longer available:

  • GET /logs - Use call artifacts instead
  • GET /workflow/{id} - Use main workflow endpoints
  • GET /test-suite, POST /test-suite - Use evaluation endpoints
  • GET /test-suite/{id}, PUT /test-suite/{id}, DELETE /test-suite/{id} - Use evaluation management
  • POST /test-suite/{testSuiteId}/run - Use evaluation runs
  • GET /knowledge-base, POST /knowledge-base - Integrated into model configurations
  • All related nested endpoints and operations

See Also:


Evaluation Execution & Results Processing

  1. Evaluation Execution Engine: Run comprehensive assistant evaluations with EvalRun and CreateEvalRunDTO. Execute your mock conversations against live assistants and squads to validate performance and behavior in controlled environments.

  2. Multiple Evaluation Models: Choose from various AI models for LLM-as-a-judge evaluation:

    • EvalOpenAIModel: GPT models including GPT-4.1, o1-mini, o3, and regional variants
    • EvalAnthropicModel: Claude models with optional thinking features for complex evaluations
    • EvalGoogleModel: Gemini models from 1.0 Pro to 2.5 Pro for diverse evaluation needs
    • EvalGroqModel: High-speed inference models including Llama and custom options
    • EvalCustomModel: Your own evaluation models with custom endpoints
  3. Evaluation Results: Comprehensive result tracking with EvalRunResult:

    • status: Pass/fail evaluation outcomes
    • messages: Complete conversation transcript from the evaluation
    • startedAt and endedAt: Precise timing information for performance analysis
  4. Target Flexibility: Run evaluations against different targets:

  5. Evaluation Status Tracking: Monitor evaluation progress with detailed status information:

    • running: Evaluation in progress
    • ended: Evaluation completed
    • queued: Evaluation waiting to start
    • Detailed endedReason including success, error, timeout, and cancellation states
  6. Judge Configuration: Optimize evaluation accuracy with model-specific settings:

    • maxTokens: Recommended 50-10000 tokens (1 token for simple pass/fail responses)
    • temperature: 0-0.3 recommended for LLM-as-a-judge to reduce hallucinations

For LLM-as-a-judge evaluations, the judge model must respond with exactly “pass” or “fail”. Design your evaluation prompts to ensure clear, deterministic responses.

Evaluation Capabilities

Multi-Model Support

Choose from OpenAI, Anthropic, Google, Groq, or custom models for evaluation, matching your quality and performance requirements.

Comprehensive Results

Detailed pass/fail results with complete conversation transcripts and timing information for thorough analysis.

Flexible Targets

Test individual assistants or entire squads with optional configuration overrides for comprehensive validation.

Status Monitoring

Real-time evaluation status tracking with detailed reason codes for failures, timeouts, and cancellations.


Voicemail Detection & Handling Improvements

  1. Enhanced Beep Detection: Improve voicemail detection accuracy with CreateVoicemailToolDTO.beepDetectionEnabled specifically for Twilio-based calls. This feature detects the characteristic beep sound that indicates voicemail recording has started.

  2. Workflow Voicemail Integration: Configure comprehensive voicemail handling in workflows with enhanced message and detection capabilities:

  3. Assistant Voicemail Enhancement: Improved voicemail handling in assistant configurations with Assistant.voicemailMessage and Assistant.voicemailDetection for consistent behavior across all conversation types.

  4. Multiple Detection Methods: Choose from various voicemail detection providers:

  5. Beep Detection for Call Flows: The new beep detection capability works specifically with Twilio transport, providing reliable voicemail identification when traditional detection methods may not be sufficient.

  6. Voicemail Tool Configuration: Enhanced tool rejection and messaging capabilities ensure appropriate handling when voicemail is detected, with configurable responses based on your business requirements.

Beep detection is currently available only for Twilio-based calls. If you’re using other providers, consider combining multiple detection methods for better accuracy.

Voicemail Management Features

Multi-Provider Detection

Support for Google, OpenAI, Twilio, and Vapi detection methods, allowing you to choose the best option for your use case.

Beep Detection

Advanced audio analysis to detect voicemail beeps on Twilio calls for more reliable voicemail identification.

Custom Messaging

Configure personalized voicemail messages up to 1000 characters for better user experience and brand consistency.

Workflow Integration

Comprehensive voicemail handling throughout workflow nodes with consistent configuration across conversation flows.


Advanced Analytics & Variable Grouping

  1. Variable Value Analytics: Gain deeper insights into your assistant performance with AnalyticsQuery.groupByVariableValue. Group analytics data by specific variable values extracted during calls for granular performance analysis.

  2. Enhanced Grouping Options: Use VariableValueGroupBy to specify custom grouping criteria:

    • key: The variable value key to group by (up to 100 characters)
    • Combine with existing grouping options like assistantId, endedReason, and status
  3. Multi-Dimensional Analysis: Create complex analytics queries by combining traditional grouping fields with variable values:

    • Group by assistant performance AND custom business metrics
    • Analyze conversation outcomes by extracted data points
    • Track success rates across different variable value segments
  4. Advanced Query Capabilities: Enhanced AnalyticsQuery functionality enables sophisticated data analysis:

    • Multiple grouping dimensions for comprehensive insights
    • Variable-based segmentation for business intelligence
    • Custom metric tracking through extracted call variables
  5. Business Intelligence Integration: Connect your call data to business outcomes by grouping analytics on:

    • Customer satisfaction scores extracted from calls
    • Product interest levels determined during conversations
    • Lead qualification status gathered through assistant interactions
    • Custom KPIs specific to your business logic

Variable values are extracted during calls using tool response schemas and aliases. Set up variable extraction in your tools to enable powerful analytics grouping based on conversation outcomes.

Analytics Enhancements

Custom Metrics

Group analytics by any variable extracted during calls, enabling business-specific performance insights and KPI tracking.

Multi-Dimensional Analysis

Combine traditional call metrics with custom variable grouping for comprehensive conversation analysis.

Business Intelligence

Connect call performance to business outcomes through variable-based analytics and custom grouping options.

Flexible Reporting

Create detailed reports by grouping on extracted conversation data like satisfaction scores, intent categories, or custom business metrics.


Chat Transport & SMS Integration

  1. Twilio SMS Transport: Send chat responses directly via SMS using TwilioSMSChatTransport in CreateChatDTO.transport. This enables programmatic SMS conversations with your voice assistants, bridging the gap between voice and text communication.

  2. SMS Session Management: Create new sessions automatically when using SMS transport by providing:

    • customer: Customer information for SMS delivery
    • phoneNumberId: SMS-enabled phone number from your organization
    • Automatic session creation when both fields are provided
  3. LLM-Generated vs Direct SMS: Control message processing with TwilioSMSChatTransport.useLLMGeneratedMessageForOutbound:

    • true (default): Input processed by assistant for intelligent responses
    • false: Direct message forwarding without LLM processing for notifications and alerts
  4. Enhanced Chat Creation: CreateChatDTO now supports sophisticated session management:

    • transport: SMS delivery configuration
    • sessionId: Use existing session data
    • Mutual exclusivity between sessionId and transport fields for clear session boundaries
  5. OpenAI Responses Integration: Streamlined chat processing with OpenAIResponsesRequest supporting the same transport and squad integration features for consistent API experience.

  6. Cross-Platform Continuity: Seamlessly transition between voice calls and SMS conversations within the same session, maintaining context and conversation history across communication channels.

SMS transport requires SMS-enabled phone numbers in your organization. The phone number must support SMS functionality and belong to your account for successful message delivery.

SMS Communication Features

Bidirectional SMS

Send and receive SMS messages through your voice assistant, enabling text-based interactions alongside voice conversations.

Smart Processing

Choose between AI-processed responses and direct message forwarding based on your use case requirements.

Session Continuity

Maintain conversation context across SMS and voice interactions within unified sessions for seamless user experiences.

Automated Management

Automatic session creation and management when using transport fields, simplifying SMS conversation setup.


API Versioning & Infrastructure Updates

  1. API Version 2 Introduction: Access enhanced functionality through new versioned endpoints while maintaining full backward compatibility:

    • /v2/call: Enhanced call management with new features and improved response formats
    • /v2/phone-number: Advanced phone number management with extended capabilities
  2. Enhanced Pagination: Improved pagination controls across all endpoints with PaginationMeta enhancements:

    • createdAtGe and createdAtLe: Date range filtering for creation timestamps
    • Better sorting and filtering options for large datasets
    • Enhanced metadata for pagination state management
  3. Workflow Message Configuration: Customize voicemail handling in workflows with CreateWorkflowDTO.voicemailMessage and CreateWorkflowDTO.voicemailDetection for comprehensive call flow management.

  4. Credential Integration: Seamless credential management across all workflow and assistant configurations with enhanced credentials.items.discriminator.mapping.custom-credential support.

  5. Transport Infrastructure: Foundation for advanced communication channels with improved transport configuration and management capabilities.

Version 2 endpoints provide enhanced features while v1 endpoints remain fully functional. Migrate to v2 when you need access to new capabilities or improved performance characteristics.

Infrastructure Improvements

Backward Compatibility

Existing v1 endpoints continue to work unchanged, ensuring smooth transitions and zero downtime for existing integrations.

Enhanced Filtering

Improved date range filtering and pagination controls for better data management and API performance.

Workflow Integration

Enhanced workflow configuration with better voicemail handling and credential management throughout the call flow.

Future-Ready Architecture

Foundation for advanced features and capabilities that will be built on the v2 API structure.


Squad Management & Session Enhancement

  1. Squad-Based Sessions: Organize your assistants into collaborative teams with Session.squad and Session.squadId. Sessions can now be associated with squads for team-based conversation management and coordinated assistant behavior.

  2. Squad Chat Integration: Enable squad-based chat conversations using Chat.squad and Chat.squadId. This allows multiple assistants to participate in or be aware of chat contexts for more sophisticated conversation handling.

  3. Enhanced Session Creation: Create squad-enabled sessions with CreateSessionDTO.squad and CreateSessionDTO.squadId, enabling persistent conversation contexts across multiple assistants and interaction types.

  4. Chat Management by Squad: Filter and organize chats by squad membership using GetChatPaginatedDTO.squadId for better conversation management and team-based analytics.

  5. Session Management by Squad: Query sessions by squad association with GetSessionPaginatedDTO.squadId, providing team-based session organization and management capabilities.

  6. Full Message History: Control conversation context retention with ArtifactPlan.fullMessageHistoryEnabled. When enabled, artifacts contain complete message history even after handoff context engineering, preserving full conversation flow for analysis.

  7. Transfer Records: Track warm transfer details with Artifact.transfers, providing comprehensive records of transfer destinations, transcripts, and status information for multi-assistant conversations.

Squad management enables sophisticated multi-assistant workflows where different specialists can handle different parts of a conversation while maintaining shared context and coordination.

Team Collaboration Features

Multi-Assistant Coordination

Enable multiple assistants to work together within squads for specialized conversation handling and seamless handoffs.

Persistent Context

Maintain conversation context across squad members and session boundaries for continuous conversation experiences.

Team Analytics

Filter conversations, sessions, and analytics by squad membership for team-based performance insights and management.

Complete Audit Trails

Track all transfers and handoffs with detailed records including destinations, transcripts, and status information.


Voice Enhancements & Minimax Improvements

  1. Minimax Voice Language Support: Enhance multilingual conversations with MinimaxVoice.languageBoost. Support for 40+ languages including:

    • Chinese and Chinese,Yue for Mandarin and Cantonese
    • English, Spanish, French, German, Japanese, Korean
    • Regional variants and specialized languages like Arabic, Hindi, Thai
    • auto mode for automatic language detection
  2. Text Normalization: Improve number reading and formatting with MinimaxVoice.textNormalizationEnabled. When enabled, spoken numbers, dates, and formatted text are properly pronounced for natural-sounding conversations.

  3. Enhanced Voice Caching: Voice responses are now cached by default with MinimaxVoice.cachingEnabled set to true, reducing latency for repeated phrases and improving overall conversation performance.

  4. Fallback Voice Configuration: Ensure conversation continuity with FallbackMinimaxVoice featuring the same language boost and text normalization capabilities as the primary voice configuration.

  5. Speaker Labeling: Track multiple speakers in conversations with BotMessage.speakerLabel, providing stable speaker identification (e.g., “Speaker 1”) for better conversation analysis and diarization.

  6. Voice Region Support: Choose optimal performance regions with Minimax’s worldwide (default) or china regional settings for better latency and compliance with local regulations.

Language boost settings help the text-to-speech model better understand context and pronunciation for specific languages, resulting in more natural and accurate voice synthesis.

Voice Quality Features

Multilingual Support

Support for 40+ languages with automatic detection and language-specific optimizations for natural pronunciation.

Smart Text Processing

Intelligent normalization of numbers, dates, and formatted text for natural-sounding speech synthesis.

Performance Optimization

Voice caching reduces latency for common phrases, while regional settings optimize for local performance.

Conversation Tracking

Speaker labeling and diarization support for multi-participant conversation analysis and management.


Enhanced Transcription Features & Speech Processing

  1. Gladia Transcription Enhancements: Improve transcription accuracy and performance with new GladiaTranscriber features:

    • region: Choose between us-west and eu-west for optimal latency and data residency compliance
    • receivePartialTranscripts: Enable low-latency streaming transcription for real-time conversation flow
    • Enhanced language detection with support for both single and multiple language modes
  2. Advanced Deepgram Controls: Fine-tune speech recognition with enhanced DeepgramTranscriber settings:

    • eotThreshold: End-of-turn detection threshold for precise conversation boundaries (e.g., 0.7)
    • eotTimeoutMs: Maximum wait time for end-of-turn detection in milliseconds (e.g., 5000ms)
    • eagerEotThreshold: Early end-of-turn detection for responsive conversations (e.g., 0.3)
  3. AssemblyAI Keyterms Enhancement: Boost recognition accuracy for critical terms with AssemblyAITranscriber.keytermsPrompt:

    • Support for up to 100 keyterms, each up to 50 characters
    • Improved recognition for specific words and phrases
    • Additional cost: $0.04/hour when enabled
  4. Speechmatics Custom Vocabulary: Enhance recognition accuracy with SpeechmaticsCustomVocabularyItem:

    • content: The word or phrase to add (e.g., “Speechmatics”)
    • soundsLike: Alternative phonetic representations (e.g., [“speech mattix”]) for better pronunciation handling
  5. Word-Level Confidence: Access detailed transcription confidence data with CustomLLMModel.wordLevelConfidenceEnabled, providing word-by-word accuracy metrics for quality assessment and debugging.

  6. Enhanced Message Metadata: Store transcription confidence and other metadata in UserMessage.metadata, enabling detailed analysis of transcription quality and user speech patterns.

AssemblyAITranscriber.wordFinalizationMaxWaitTime is now deprecated. Use the new smart endpointing plans for better speech timing control. The deprecated property will be removed in a future release.

Transcription Improvements

Regional Processing

Choose optimal transcription regions with Gladia’s us-west and eu-west options for reduced latency and compliance.

Real-Time Streaming

Enable partial transcripts for immediate response processing, reducing perceived latency in conversations.

Advanced Speech Detection

Fine-tune end-of-turn detection with configurable thresholds and timeouts for natural conversation flow.

Custom Vocabulary

Improve accuracy for domain-specific terms, company names, and technical jargon with enhanced vocabulary support.


Evaluation System Foundation

  1. Evaluation Framework: You can now systematically test your Vapi voice assistants with the new Eval system. Create comprehensive test scenarios to validate assistant behavior, conversation flow, and tool usage through mock conversations.

  2. Mock Conversation Builder: Design test conversations using Eval.messages with support for multiple message types:

  3. Evaluation Types: Currently focused on chat.mockConversation type evaluations, with the framework designed to support additional evaluation methods in future releases.

  4. Evaluation Management: Organize your tests with CreateEvalDTO and UpdateEvalDTO:

    • name: Descriptive names up to 80 characters (e.g., “Customer Support Flow Validation”)
    • description: Detailed descriptions up to 500 characters explaining the test purpose
    • messages: The complete mock conversation flow
  5. Evaluation Endpoints: Access your evaluations through the new /eval endpoint family:

    • GET /eval: List all evaluations with pagination support
    • POST /eval: Create new evaluations
    • GET /eval/{id}: Retrieve specific evaluation details
    • PUT /eval/{id}: Update existing evaluations
  6. Judge Plan Architecture: Define how assistant responses are validated using AssistantMessageJudgePlan with three evaluation methods:

This is the foundation release for the evaluation system. Evaluation execution and results processing will be available in upcoming releases. Start designing your test scenarios now to be ready for full evaluation capabilities.

Testing Capabilities

Mock Conversations

Create realistic test scenarios with user messages, system prompts, and expected assistant responses for comprehensive flow validation.

Tool Call Testing

Validate that your assistant calls the right tools with correct parameters using ChatEvalAssistantMessageMockToolCall.

Flexible Validation

Choose from exact matching, regex patterns, or AI-powered evaluation to suit different testing needs and complexity levels.

Evaluation Organization

Organize tests with descriptive names and detailed documentation to maintain clear testing workflows across your team.