- Per-Artifact Storage Routing in Artifact Plans: You can now override artifact storage behavior per assistant/call for SIP packet capture (PCAP), logging, and call recording artifacts:
Assistant.artifactPlan.pcapUseCustomStorageEnabled
(default true): Use custom storage for SIP packet capture, which are stored inAssistant.artifactPlan.pcapUrl
.Assistant.artifactPlan.loggingUseCustomStorageEnabled
(default true): Determines whether to use your custom storage (S3 or GCP) for call logs when storage credentials are configured; set to false to store logs on Vapi’s storage for this assistant, even if custom storage is set globally.Assistant.artifactPlan.recordingUseCustomStorageEnabled
(default true): Determines whether to use your custom storage (S3 or GCP) for call recordings when storage credentials are configured; set to false to store recordings on Vapi’s storage for this assistant, even if custom storage is set globally.
- End AI call transfers after set timeout period: You can now configure AI-managed transfers with a Transfer Assistant to automatically end the call after a specified period of silence with
silenceTimeoutSeconds
(default 30 seconds). This helps prevent idle calls from lingering and saves costs.
- Enhanced Tool Retry Logic with Backoff Plans: You can now use
Assistant.hooks.do[type=tool].tool.backoffPlan
andAssistant.hooks.do[type=tool].tool.server.backoffPlan
to configure retry behavior for tool calls. Options include:
fixed
backoff (default): Consistent delay between retries.exponential
backoff: Increasing delays for subsequent retries- Configurable retry limits: Set
maxRetries
(0-10, default: 0) - Flexible timing: Adjust
baseDelaySeconds
(0-10 seconds) - Smart status code handling: Exclude specific HTTP status codes from retry attempts.
-
New Structured Output Endpoints: You can now use new APIs for structured outputs to define, extract, and manage structured data from conversations.
-
Configure Structured Output Resources: You can now define reusable structured data extraction templates, including:
- Custom JSON Schema: Specify the exact structure and validation rules for extracted data using full JSON Schema support (objects, arrays, enums, validation constraints, and more).
- Model Selection: Choose the LLM (OpenAI, Anthropic, Google, or custom) for extraction, or provide custom system/user prompts with Liquid templating for advanced scenarios.
- Context Linking: Link structured outputs to specific workflows or assistants for context-aware extraction.
- Metadata: Track creation/update timestamps, org linkage, and provide rich descriptions for each structured output.
- Assistant Transfer Improvements: You can now include an optional
name
property to better identify and manage your transfer assistants.
Voicemail Detection Enhancements
- Voicemail Detection Enhancements: You can now configure voicemail detection across providers with
Assistant.voicemailDetection
, for example with Vapi, Google, and OpenAI.Each plan now supports atype
property to select between:
audio
: Native audio model detection (default)transcript
: ASR/transcript-based detection
- Fine-tuned control: Under each voicemail detection plan, you can configure backoff plans and beep detection timing with
Assistant.voicemailDetection["yourVoicemailDetectionPlan"].beepMaxAwaitSeconds
for improved voicemail handling in automated calls.
- Enhanced Artifact Plans: All
Artifact Plans
now support the following properties:
loggingEnabled
- Toggle to enable call logsloggingPath
- Custom path for call log uploadsstructuredOutputs
- Toggle for structured output extraction
-
Enhanced Artifact Management: You can now extract structured data during calls with the new
structuredOutputs
property inArtifact
. -
Improved Cost Analysis: You can now view detailed call costs with new fields in
Analysis Cost Breakdown
:
structuredOutput
- Cost for structured output evaluationstructuredOutputPromptTokens
- Prompt tokens for structured outputstructuredOutputCompletionTokens
- Completion tokens for structured output
- Phone Number Hooks: You can now configure hooks for call ending events with the new
Call Ending hook for phone numbers
and exclude events to exclude from the hook withthe relevant filter
- Handoff Tool and Dynamic Agent Routing: You can now hand off conversations when building multi-agent systems with
Assistant.model.tools[type=handoff]
. Supported destinations include:
- Assistant Destinations: Directly hand off to a specific assistant by assistantId or assistantName.
- Dynamic Destinations: Route handoffs dynamically via a webhook to your server, which can determine the destination assistant in real time. Custom parameters such as customer intent, sentiment, or area code can be passed to the webhook for advanced routing logic.
- Multiple Destinations: Support for both single handoff destination per tool with multiple tools (recommended for OpenAI) and multiple handoff destinations with one tool (recommended for Anthropic).
You can read more about how to configure the handoff tool in the API Reference
- Context Engineering for Handoffs: When handing off a conversation, you can now control what context is passed to the next assistant:
- All Messages: Pass the entire conversation history. Refer to Context Engineering Plan All
- Last N Messages: Pass only the most recent N messages. Refer to Context Engineering Plan LastNMessages
- None: Pass no prior context. Refer to Context Engineering Plan None This gives you fine-grained control over privacy, relevance, and prompt size during agent transitions.
-
Message Metadata: Tool Messages, Assistant Messages, and Developer Messages objects now support an optional metadata field, allowing you to attach arbitrary metadata to messages for downstream processing or analytics.
-
Pagination Meta Enhancement: You can now reference
itemsBeyondRetention
boolean in paginated responses to indicate if additional items exist beyond the retention window.
New: Call Metrics & Artifact Improvements
You can now access detailed call performance metrics and structured output IDs directly from your call artifacts.
Call.artifact.performanceMetrics.turnLatencies
Call.artifact.performanceMetrics.modelLatencyAverage
Call.artifact.performanceMetrics.voiceLatencyAverage
Call.artifact.performanceMetrics.transcriberLatencyAverage
Call.artifact.performanceMetrics.endpointingLatencyAverage
Call.artifact.performanceMetrics.turnLatencyAverage
During call: Access array of output IDs
Call.artifactPlan.structuredOutputIds
After call: Extracted outputs are stored here
Call.artifact.structuredOutputs
These improvements help you monitor, debug, and analyze your calls with greater detail.
New: Smarter Conditions & Security Filters
- New Condition & Filter Types: You can now use the following new condition and filter types to build more robust rejection plans and security filter plans:
- MessageTarget: Target specific messages by role and position for conditions using
Assistant.hooks.do[type=tool].tool.rejectionPlan.conditions[type=regex].target
. - GroupCondition: Combine multiple conditions using AND/OR logic, with support for recursive nesting using
Assistant.hooks.do[type=tool].tool.rejectionPlan.conditions[type=group]
. - RegexCondition: Flexible pattern matching, with full support for JavaScript regex and negation using
Assistant.hooks.do[type=tool].tool.rejectionPlan.conditions[type=regex]
. - LiquidCondition: Use Liquid templates for complex, context-aware logic using
Assistant.hooks.do[type=tool].tool.rejectionPlan.conditions[type=liquid]
. - Security Filters: New filter types for RCE, XSS, SSRF, SQL injection, prompt injection, and regex-based filtering using
Assistant.compliancePlan.securityFilterPlan.filters
.
- Tool Rejection Plans: You can now use
Assistant.hooks.do[type=tool].tool.rejectionPlan
in all tool calls to prevent accidental tool execution, enforce confirmation steps, and build more robust conversation flows. This helps you to define complex logic for when a tool call should be rejected, enhancing both safety and call experience. Rejection plans can be built using regex conditions, Liquid templates, or logical groups (AND/OR). For example, you can prevent anendCall
tool from executing unless the user says goodbye, or block a transfer if the user is actually asking a question.
Example:
- Security Filter Plans for Transcripts and Messages: You can now use
Assistant.compliancePlan.securityFilterPlan
to define how transcripts and messages are filtered against threats like SQL injection, XSS, prompt injection, and more. Choose betweensanitize
,reject
, orreplace
when threats are detected, and specify custom replacement text. User messages and transcript objects now include:
isFiltered
: Indicates if content was filtered for security.detectedThreats
: Lists detected threats.originalMessage
/originalTranscript
: Preserves original content if filtering occurred.
🎤 New Gladia Transcription Provider Support
-
Custom vocabulary support: Enable a custom vocabulary with
Gladia
usingAssistant.transcriber[provider="GladiaTranscriber"].customVocabularyEnabled
. You can also specify simple strings or detailed objects with fields for value, language, intensity, and alternative pronunciations usingAssistant.transcriber[provider="GladiaTranscriber"].customVocabularyConfig
- letting you fine-tune recognition of domain-specific terms. -
Endpointing & Speech Threshold: Configure endpointing time (wait time before considering speech ended) and speech sensitivity, enabling more accurate and responsive transcription with
Assistant.transcriber[provider="GladiaTranscriber"].endpointing
andAssistant.transcriber[provider="GladiaTranscriber"].speechThreshold
. -
Prosody & Audio Enhancer: Optionally enable prosody (for transcribing non-verbal cues like laughter) and audio enhancement for improved accuracy with
Assistant.transcriber[provider="GladiaTranscriber"].prosodyEnabled
andAssistant.transcriber[provider="GladiaTranscriber"].audioEnhancerEnabled
. -
Flexible Language Detection: Choose between manual and automatic language detection modes with
Assistant.transcriber[provider="GladiaTranscriber"].languageDetectionMode
. -
Confidence Thresholds & Hints: Discard low-confidence transcripts and provide context hints for improved accuracy with
Assistant.transcriber[provider="GladiaTranscriber"].confidenceThreshold
andAssistant.transcriber[provider="GladiaTranscriber"].hints
.
💳 Subscription Updates
Role-based access control (RBAC):
Enable RBAC for your subscription using Subscription.rbacEnabled
.
Retention settings:
Configure how long calls and chats are stored with Subscription.callRetentionDays
and Subscription.chatRetentionDays
.
Reset frequency:
Set how often included minutes reset using Subscription.minutesIncludedResetFrequency
.