Handoff tool

The handoff tool enables seamless call transfers between assistants in a multi-agent system. This guide covers all configuration patterns, destination types, context management, and advanced features.

Overview

The handoff tool transfers calls between assistants during a conversation. You can:

Transfer to a specific assistant by ID or by name (within a squad)
Transfer to an entire squad with a designated entry assistant
Support multiple destination options for the AI to choose from
Determine the destination dynamically at runtime via a webhook
Control what conversation history the next assistant receives
Extract structured variables from the conversation for downstream use
Configure spoken messages for each phase of the handoff
Reject handoff attempts based on conversation state

System prompt best practices

When using the handoff tool, add this to your system prompt for optimal agent coordination (adapted from the OpenAI Agents Handoff Prompt):

# System context

You are part of a multi-agent system designed to make agent coordination and execution easy.
The system uses two primary abstractions: **Agents** and **Handoffs**. An agent encompasses
instructions and tools and can hand off a conversation to another agent when appropriate.
Handoffs are achieved by calling a handoff function, generally named `handoff_to_<agent_name>`.
Handoffs between agents are handled seamlessly in the background; do not mention or draw
attention to these handoffs in your conversation with the user.

# Agent context

{put your agent system prompt here}

Basic configuration

Single destination handoff

Using assistant ID

{
  "tools": [
    {
      "type": "handoff",
      "destinations": [
        {
          "type": "assistant",
          "assistantId": "03e11cfe-4528-4243-a43d-6aded66ab7ba",
          "description": "customer wants to speak with technical support",
          "contextEngineeringPlan": {
            "type": "all"
          }
        }
      ]
    }
  ]
}

Using assistant name (for squad members)

{
  "tools": [
    {
      "type": "handoff",
      "destinations": [
        {
          "type": "assistant",
          "assistantName": "TechnicalSupportAgent",
          "description": "customer needs technical assistance",
          "contextEngineeringPlan": {
            "type": "all"
          }
        }
      ]
    }
  ]
}

Each assistant destination also supports assistantOverrides to override settings on the destination assistant, and an inline assistant property to create a transient assistant without saving it first. See the API reference for all available properties.

Multiple destinations

Multiple tools pattern (OpenAI recommended)

Best for OpenAI models — creates separate tool definitions for each destination:

{
  "tools": [
    {
      "type": "handoff",
      "destinations": [
        {
          "type": "assistant",
          "assistantId": "sales-assistant-123",
          "description": "customer wants to learn about pricing or make a purchase",
          "contextEngineeringPlan": {
            "type": "all"
          }
        }
      ]
    },
    {
      "type": "handoff",
      "destinations": [
        {
          "type": "assistant",
          "assistantId": "support-assistant-456",
          "description": "customer needs help with an existing product or service",
          "contextEngineeringPlan": {
            "type": "all"
          }
        }
      ]
    },
    {
      "type": "handoff",
      "destinations": [
        {
          "type": "assistant",
          "assistantId": "billing-assistant-789",
          "description": "customer has questions about invoices, payments, or refunds",
          "contextEngineeringPlan": {
            "type": "lastNMessages",
            "maxMessages": 5
          }
        }
      ]
    }
  ]
}

Single tool pattern (Anthropic recommended)

Best for Anthropic models — single tool with multiple destination options:

{
  "tools": [
    {
      "type": "handoff",
      "destinations": [
        {
          "type": "assistant",
          "assistantId": "03e11cfe-4528-4243-a43d-6aded66ab7ba",
          "description": "customer wants to learn about pricing or make a purchase"
        },
        {
          "type": "assistant",
          "assistantName": "support-assistant",
          "description": "customer needs help with an existing product or service"
        },
        {
          "type": "assistant",
          "assistantName": "billing-assistant",
          "description": "customer has questions about invoices, payments, or refunds"
        }
      ]
    }
  ]
}
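
The two patterns differ only in how the same destination list is grouped into tools. As a rough sketch of that difference, the helpers below build either shape from one list; the function names are illustrative and not part of any Vapi SDK.

```python
from typing import Any

Destination = dict[str, Any]


def multiple_tools(destinations: list[Destination]) -> list[dict[str, Any]]:
    """One handoff tool per destination (the pattern suggested above for OpenAI models)."""
    return [{"type": "handoff", "destinations": [dest]} for dest in destinations]


def single_tool(destinations: list[Destination]) -> list[dict[str, Any]]:
    """One handoff tool listing every destination (the pattern suggested for Anthropic models)."""
    return [{"type": "handoff", "destinations": list(destinations)}]


# Destinations copied from the single tool example.
DESTINATIONS: list[Destination] = [
    {
        "type": "assistant",
        "assistantId": "03e11cfe-4528-4243-a43d-6aded66ab7ba",
        "description": "customer wants to learn about pricing or make a purchase",
    },
    {
        "type": "assistant",
        "assistantName": "support-assistant",
        "description": "customer needs help with an existing product or service",
    },
    {
        "type": "assistant",
        "assistantName": "billing-assistant",
        "description": "customer has questions about invoices, payments, or refunds",
    },
]

if __name__ == "__main__":
    import json

    # Prints the "tools" array in the multiple tools shape.
    print(json.dumps({"tools": multiple_tools(DESTINATIONS)}, indent=2))
```

Keeping the destination list in one place and generating the tool layout from it makes it easier to switch patterns when you change model providers.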

Dynamic handoffs

Basic dynamic handoff

The destination is determined at runtime via the handoff-destination-request webhook:

{
  "tools": [
    {
      "type": "handoff",
      "destinations": [
        {
          "type": "dynamic",
          "server": {
            "url": "https://api.example.com/determine-handoff-destination",
            "headers": {
              "Authorization": "Bearer YOUR_API_KEY"
            }
          }
        }
      ]
    }
  ]
}

Your server must respond with a single destination. You can return an assistantId, assistantName (if using squads), or a transient assistant. For example:

{
  "destination": {
    "type": "assistant",
    "assistantId": "assistant-id",
    "variableExtractionPlan": {
      "schema": {
        "type": "object",
        "properties": {
          "name": {
            "type": "string",
            "description": "Name of the customer"
          }
        },
        "required": ["name"]
      }
    },
    "contextEngineeringPlan": {
      "type": "none"
    }
  }
}

If the handoff should not execute, either respond with an empty destination, or provide a custom error. The custom error is added to the message history.

{
  "error": "Example custom error message"
}
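
As a minimal sketch of the two response shapes, the helpers below build either a destination body or an error body. They only produce JSON; how your server receives the handoff-destination-request and how it looks up an assistant are left out, and the convention of passing `None` when no assistant is found is an assumption made for this example.

```python
import json
from typing import Any, Optional


def destination_response(assistant_id: str, context_type: str = "all") -> dict[str, Any]:
    """Build a response that names a single assistant destination."""
    return {
        "destination": {
            "type": "assistant",
            "assistantId": assistant_id,
            "contextEngineeringPlan": {"type": context_type},
        }
    }


def error_response(message: str) -> dict[str, Any]:
    """Build a response that stops the handoff with a custom error."""
    return {"error": message}


def build_response(assistant_id: Optional[str]) -> str:
    """Return the JSON body for the webhook reply.

    `assistant_id` is whatever your own lookup produced; None means no
    suitable assistant was found (a hypothetical convention).
    """
    if assistant_id is None:
        body = error_response("No specialist is available right now")
    else:
        body = destination_response(assistant_id)
    return json.dumps(body)
```

For example, `build_response("assistant-id")` yields a body with a `destination` object, while `build_response(None)` yields only an `error` field.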

Dynamic handoff with custom parameters

Pass additional context to your webhook for intelligent routing:

{
  "tools": [
    {
      "type": "handoff",
      "destinations": [
        {
          "type": "dynamic",
          "server": {
            "url": "https://api.example.com/intelligent-routing"
          }
        }
      ],
      "function": {
        "name": "handoff_with_context",
        "description": "Transfer the call to the most appropriate specialist",
        "parameters": {
          "type": "object",
          "properties": {
            "destination": {
              "type": "string",
              "description": "Use 'dynamic' to route to the best available agent",
              "enum": ["dynamic"]
            },
            "customerAreaCode": {
              "type": "number",
              "description": "Customer's area code for regional routing"
            },
            "customerIntent": {
              "type": "string",
              "enum": ["new-customer", "existing-customer", "partner"],
              "description": "Customer type for proper routing"
            },
            "customerSentiment": {
              "type": "string",
              "enum": ["positive", "negative", "neutral", "escalated"],
              "description": "Current emotional state of the customer"
            },
            "issueCategory": {
              "type": "string",
              "enum": ["technical", "billing", "sales", "general"],
              "description": "Primary category of the customer's issue"
            },
            "priority": {
              "type": "string",
              "enum": ["low", "medium", "high", "urgent"],
              "description": "Urgency level of the request"
            }
          },
          "required": ["destination", "customerIntent", "issueCategory"]
        }
      }
    }
  ]
}
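
On the server side, routing on these parameters could look roughly like the sketch below. It assumes you have already pulled the tool call arguments out of the incoming request into a plain dict (where they sit in the payload is not shown here), and the routing table and the `EscalationAgent` name are hypothetical.

```python
from typing import Any

# Hypothetical mapping from issueCategory to squad member names.
ROUTES = {
    "technical": "TechnicalSupportAgent",
    "billing": "billing-assistant",
    "sales": "SalesSpecialist",
}
ESCALATION_ASSISTANT = "EscalationAgent"  # hypothetical assistant name


def route(args: dict[str, Any]) -> dict[str, Any]:
    """Pick a destination from the custom tool call parameters."""
    urgent = args.get("priority") == "urgent"
    escalated = args.get("customerSentiment") == "escalated"
    if urgent or escalated:
        # Urgent or escalated callers skip category routing.
        name = ESCALATION_ASSISTANT
    else:
        name = ROUTES.get(args.get("issueCategory", ""))
    if name is None:
        # No route for this category: return a custom error instead of a destination.
        return {"error": "No specialist is configured for this issue category"}
    return {"destination": {"type": "assistant", "assistantName": name}}
```

Because `general` has no entry in the table, a request with that category produces the error response and the handoff does not execute.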

Squad destinations

In addition to assistant and dynamic destinations, you can hand off a call to an entire squad. This transfers the caller into a new multi-agent system where the squad’s own routing logic takes over.

Using squad ID

Reference a saved squad by its ID:

{
  "tools": [
    {
      "type": "handoff",
      "destinations": [
        {
          "type": "squad",
          "squadId": "your-squad-id",
          "description": "customer needs specialized support from the enterprise team",
          "entryAssistantName": "EnterpriseGreeter",
          "contextEngineeringPlan": {
            "type": "userAndAssistantMessages"
          }
        }
      ]
    }
  ]
}

Using a transient squad

Define the squad inline without saving it first:

{
  "tools": [
    {
      "type": "handoff",
      "destinations": [
        {
          "type": "squad",
          "squad": {
            "members": [
              {
                "assistantId": "greeter-assistant-id",
                "assistantDestinations": [
                  {
                    "type": "assistant",
                    "assistantName": "SalesSpecialist",
                    "description": "customer is interested in purchasing"
                  }
                ]
              },
              {
                "assistantId": "sales-assistant-id"
              }
            ]
          },
          "entryAssistantName": "GreeterAssistant",
          "description": "route customer to the sales squad"
        }
      ]
    }
  ]
}

Squad destination properties

Property	Type	Description
type	`"squad"`	Required. Identifies this as a squad destination.
squadId	string	The ID of a saved squad. Provide either `squadId` or `squad`.
squad	object	A transient squad definition. Provide either `squadId` or `squad`.
entryAssistantName	string	The name of the assistant to start with. If not provided, the first squad member is used.
description	string	Describes when the AI should choose this destination.
contextEngineeringPlan	object	Controls what conversation history transfers to the squad.
variableExtractionPlan	object	Extracts structured data from the conversation before handoff.
squadOverrides	object	Overrides applied to the squad configuration (maps to squad-level `membersOverrides`).

For the full schema, see the API reference.
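
A small pre-flight check can catch the most common mistake in a squad destination before you send the configuration. The sketch below reads "provide either `squadId` or `squad`" as "exactly one of the two", which is an interpretation of the table above rather than confirmed API behavior; the function name is illustrative.

```python
from typing import Any


def squad_destination_problems(dest: dict[str, Any]) -> list[str]:
    """Return a list of human-readable problems; an empty list means none were found."""
    problems: list[str] = []
    if dest.get("type") != "squad":
        problems.append('type must be "squad"')
    # Assumption: exactly one of squadId (saved squad) or squad (transient squad) is expected.
    if ("squadId" in dest) == ("squad" in dest):
        problems.append("provide exactly one of squadId or squad")
    entry = dest.get("entryAssistantName")
    if entry is not None and (not isinstance(entry, str) or not entry):
        problems.append("entryAssistantName must be a non-empty string when provided")
    return problems
```

Running it against the squad ID example above returns an empty list, while a destination with neither `squadId` nor `squad` returns one problem.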

Context engineering

Control what conversation history transfers to the next assistant or squad. Set contextEngineeringPlan on any destination.

All messages (default)

Transfers the entire conversation history:

{
  "contextEngineeringPlan": {
    "type": "all"
  }
}

Last N messages

Transfers only the most recent N messages. Use this to limit context size for performance:

{
  "contextEngineeringPlan": {
    "type": "lastNMessages",
    "maxMessages": 10
  }
}

User and assistant messages only

Transfers only user and assistant messages, filtering out system messages, tool calls, and tool results. This gives the next assistant a clean view of the conversation without internal implementation details:

{
  "contextEngineeringPlan": {
    "type": "userAndAssistantMessages"
  }
}

Use userAndAssistantMessages when the destination assistant does not need to see tool call history or system prompts from the previous assistant. This produces a cleaner context and reduces token usage.

No context

Starts the next assistant with a blank conversation:

{
  "contextEngineeringPlan": {
    "type": "none"
  }
}
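
To make the four plan types concrete, the sketch below applies each one to a small in-memory history. It is an illustration of the behavior described in this section, not Vapi's implementation, and it assumes a simplified message shape in which every message is a dict with a `role` key and tool activity uses the role `tool`.

```python
from typing import Any, Optional

Message = dict[str, Any]


def apply_context_plan(messages: list[Message], plan: Optional[dict[str, Any]] = None) -> list[Message]:
    """Return the messages the next assistant would see under the given plan."""
    plan = plan or {"type": "all"}  # "all" is the documented default
    kind = plan["type"]
    if kind == "all":
        return list(messages)
    if kind == "lastNMessages":
        n = plan["maxMessages"]
        # Guard n <= 0, because messages[-0:] would return the whole list.
        return list(messages[-n:]) if n > 0 else []
    if kind == "userAndAssistantMessages":
        return [m for m in messages if m["role"] in ("user", "assistant")]
    if kind == "none":
        return []
    raise ValueError(f"unknown plan type: {kind!r}")


HISTORY: list[Message] = [
    {"role": "system", "content": "You are a greeter."},
    {"role": "user", "content": "I need help with my invoice."},
    {"role": "assistant", "content": "Let me look that up."},
    {"role": "tool", "content": "{\"invoice\": \"INV-1\"}"},
    {"role": "user", "content": "Thanks."},
]
```

With this history, `userAndAssistantMessages` keeps three of the five messages, and `lastNMessages` with `maxMessages` set to 2 keeps the tool result and the final user message.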

Variable extraction

Extract and pass structured data during handoff. Variables extracted by the handoff tool are available to all subsequent assistants in the conversation chain. When a handoff extracts a variable with the same name as an existing one, the new value replaces the previous value.
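
The replacement rule behaves like a dictionary merge in which the newer extraction wins. The sketch below models only that rule, using a hypothetical helper name and sample values.

```python
from typing import Any


def merge_extracted(existing: dict[str, Any], extracted: dict[str, Any]) -> dict[str, Any]:
    """Combine variables from an earlier handoff with newly extracted ones.

    Keys present in both take the new value; all other keys are kept.
    """
    return {**existing, **extracted}


# Variables after a first handoff, then after a second one that re-extracts customerName.
first = {"customerName": "Ada", "email": "ada@example.com"}
second = merge_extracted(first, {"customerName": "Ada Lovelace"})
```

Here `second` keeps the original `email` and carries the updated `customerName`.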

Extraction via `variableExtractionPlan` in destinations

This extraction method makes an OpenAI structured output request to extract variables. Use this when you have multiple destinations, each with different variables that need to be extracted.

{
  "tools": [
    {
      "type": "handoff",
      "destinations": [
        {
          "type": "assistant",
          "assistantName": "order-processing-assistant",
          "description": "customer is ready to place an order",
          "variableExtractionPlan": {
            "schema": {
              "type": "object",
              "properties": {
                "customerName": {
                  "type": "string",
                  "description": "Full name of the customer"
                },
                "email": {
                  "type": "string",
                  "format": "email",
                  "description": "Customer's email address"
                },
                "productIds": {
                  "type": "array",
                  "items": {
                    "type": "string"
                  },
                  "description": "List of product IDs customer wants to order"
                },
                "shippingAddress": {
                  "type": "object",
                  "properties": {
                    "street": { "type": "string" },
                    "city": { "type": "string" },
                    "state": { "type": "string" },
                    "zipCode": { "type": "string" }
                  }
                }
              },
              "required": ["customerName", "productIds"]
            }
          }
        }
      ]
    }
  ]
}

Variable access patterns

Once extracted, variables are accessible using Liquid template syntax ({{variableName}}). The access pattern depends on the schema structure:

Schema type	Access pattern	Example
Simple property	`{{propertyName}}`	`{{customerName}}`
Nested object	`{{object.property}}`	`{{name.first}}`, `{{name.last}}`
Array item	`{{array[index]}}`	`{{zipCodes[0]}}`, `{{zipCodes[1]}}`
Array of objects	`{{array[index].property}}`	`{{people[0].name}}`, `{{people[0].age}}`
Nested array	`{{array[index].nestedArray[index]}}`	`{{people[0].zipCodes[1]}}`

Top-level object properties are extracted as direct global variables. For example, a schema with properties name and age produces {{name}} and {{age}} — not {{root.name}}.
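
To see how the access patterns in the table map onto extracted data, the sketch below resolves the same path expressions against a Python dict. The Liquid engine does this for you at runtime; this stand-in only mimics dotted and indexed lookups, and the sample values are invented.

```python
import re
from typing import Any

# A path is a sequence of property names and [index] segments, e.g. people[0].zipCodes[1].
_SEGMENT = re.compile(r"([A-Za-z_]\w*)|\[(\d+)\]")


def resolve(path: str, variables: dict[str, Any]) -> Any:
    """Walk `variables` following a path such as 'people[0].name'."""
    value: Any = variables
    for name, index in _SEGMENT.findall(path):
        # Exactly one of name/index is non-empty for each matched segment.
        value = value[name] if name else value[int(index)]
    return value


VARIABLES = {
    "customerName": "Ada Lovelace",
    "name": {"first": "Ada", "last": "Lovelace"},
    "zipCodes": ["94107", "10001"],
    "people": [{"name": "Ada", "age": 36, "zipCodes": ["94107", "10001"]}],
}
```

For instance, `resolve("people[0].zipCodes[1]", VARIABLES)` follows the nested array pattern from the last table row, and `resolve("customerName", VARIABLES)` reads a top-level property directly, with no `root` prefix.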

Variable aliases

Use aliases to create additional variables derived from extracted values. Aliases support Liquid template syntax for transformations and compositions.

{
  "variableExtractionPlan": {
    "schema": {
      "type": "object",
      "properties": {
        "firstName": {
          "type": "string",
          "description": "Customer's first name"
        },
        "lastName": {
          "type": "string",
          "description": "Customer's last name"
        },
        "company": {
          "type": "string",
          "description": "Customer's company name"
        }
      }
    },
    "aliases": [
      {
        "key": "fullName",
        "value": "{{firstName}} {{lastName}}"
      },
      {
        "key": "greeting",
        "value": "Hello {{firstName}}, welcome to {{company}}!"
      },
      {
        "key": "customerCity",
        "value": "{{addresses[0].city}}"
      }
    ]
  }
}

Each alias creates a new variable accessible as {{key}} during the call and stored in call.artifact.variableValues after the call. Alias keys must start with a letter and contain only letters, numbers, or underscores (max 40 characters).
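
The key rules can be checked locally before you submit a configuration. The sketch below encodes them as a regular expression and adds a deliberately simplified renderer that substitutes plain `{{name}}` placeholders only; it is not a Liquid implementation and does not handle filters, dotted paths, or indexes.

```python
import re
from typing import Any

# Starts with a letter, then letters, digits, or underscores, 40 characters at most.
_ALIAS_KEY = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,39}")
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def is_valid_alias_key(key: str) -> bool:
    """Check a key against the naming rules stated above."""
    return _ALIAS_KEY.fullmatch(key) is not None


def render_simple(template: str, variables: dict[str, Any]) -> str:
    """Substitute {{name}} placeholders; a simplified stand-in for Liquid rendering."""
    return _PLACEHOLDER.sub(lambda m: str(variables[m.group(1)]), template)


extracted = {"firstName": "Ada", "lastName": "Lovelace", "company": "Acme"}
full_name = render_simple("{{firstName}} {{lastName}}", extracted)
```

A key such as `fullName` passes the check, while `1stName` and `full-name` do not.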

Extraction via `tool.function`

You can also extract variables through the parameters of the LLM tool call. For dynamic handoffs, these parameters are also sent to your server in the handoff-destination-request. Include a destination parameter whose enum lists the assistant names or IDs; Vapi uses it to determine where to hand off the call. The destination parameter itself is not extracted as a variable. Add destination and every other variable you require to the schema’s required array.

{
  "tools": [
    {
      "type": "handoff",
      "destinations": [
        {
          "type": "assistant",
          "assistantName": "order-processing-assistant",
          "description": "customer is ready to place an order"
        }
      ],
      "function": {
        "name": "handoff_to_order_processing_assistant",
        "parameters": {
          "type": "object",
          "properties": {
            "destination": {
              "type": "string",
              "description": "The destination to handoff the call to.",
              "enum": ["order-processing-assistant"]
            },
            "customerName": {
              "type": "string",
              "description": "Full name of the customer"
            },
            "email": {
              "type": "string",
              "format": "email",
              "description": "Customer's email address"
            },
            "productIds": {
              "type": "array",
              "items": {
                "type": "string"
              },
              "description": "List of product IDs customer wants to order"
            },
            "shippingAddress": {
              "type": "object",
              "properties": {
                "street": { "type": "string" },
                "city": { "type": "string" },
                "state": { "type": "string" },
                "zipCode": { "type": "string" }
              }
            }
          },
          "required": ["destination", "customerName", "email"]
        }
      }
    }
  ]
}

Tool messages

Configure what the assistant says during each phase of the handoff. Add a messages array to the handoff tool to control the spoken responses.

Message types

Type	Trigger	Default behavior
`request-start`	Handoff begins executing	Says a random filler: “Hold on a sec”, “One moment”, etc.
`request-complete`	Handoff completes successfully	Model generates a response
`request-failed`	Handoff fails	Model generates a response
`request-response-delayed`	Server is slow or user speaks during processing	Says “Sorry, a few more seconds.”

Example configuration

{
  "tools": [
    {
      "type": "handoff",
      "messages": [
        {
          "type": "request-start",
          "content": "Let me transfer you now. One moment please."
        },
        {
          "type": "request-complete",
          "content": "You're now connected. How can the next specialist help you?"
        },
        {
          "type": "request-failed",
          "content": "I'm sorry, I wasn't able to complete the transfer. Let me try to help you directly."
        },
        {
          "type": "request-response-delayed",
          "content": "Still working on the transfer, thank you for your patience.",
          "timingMilliseconds": 3000
        }
      ],
      "destinations": [
        {
          "type": "assistant",
          "assistantId": "your-assistant-id",
          "description": "transfer to specialist"
        }
      ]
    }
  ]
}

Message properties

request-start

content (string) — The text the assistant speaks when the handoff begins.
blocking (boolean, default: false) — When true, the tool call waits until the message finishes speaking before executing.
conditions (array) — Optional conditions that must match for this message to trigger.
contents (array) — Multilingual variants of the content. Overrides content when provided.

request-complete

content (string) — The text the assistant speaks when the handoff completes.
role ("assistant" | "system", default: "assistant") — When "assistant", the content is spoken aloud. When "system", the content is passed as a system message hint to the model.
endCallAfterSpokenEnabled (boolean, default: false) — When true, the call ends after this message is spoken.
conditions (array) — Optional conditions for triggering this message.
contents (array) — Multilingual variants.

request-failed

content (string) — The text the assistant speaks when the handoff fails.
endCallAfterSpokenEnabled (boolean, default: false) — When true, the call ends after this message.
conditions (array) — Optional conditions for triggering.
contents (array) — Multilingual variants.

request-response-delayed

content (string) — The text the assistant speaks when the handoff is taking longer than expected.
timingMilliseconds (number, 100-120000) — Milliseconds to wait before triggering this message.
conditions (array) — Optional conditions for triggering.
contents (array) — Multilingual variants.

For the full schema, see the API reference.
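
If you generate tool configurations in code, small builders can enforce the documented constraints early. The sketch below covers two of the message types and checks the 100 to 120000 range for `timingMilliseconds`; the helper names are illustrative and only the properties listed above are used.

```python
from typing import Any

MIN_DELAY_MS = 100
MAX_DELAY_MS = 120_000


def delayed_message(content: str, timing_ms: int) -> dict[str, Any]:
    """Build a request-response-delayed message, validating the timing range."""
    if not MIN_DELAY_MS <= timing_ms <= MAX_DELAY_MS:
        raise ValueError(
            f"timingMilliseconds must be between {MIN_DELAY_MS} and {MAX_DELAY_MS}, got {timing_ms}"
        )
    return {
        "type": "request-response-delayed",
        "content": content,
        "timingMilliseconds": timing_ms,
    }


def failed_message(content: str, end_call: bool = False) -> dict[str, Any]:
    """Build a request-failed message; end_call maps to endCallAfterSpokenEnabled."""
    return {
        "type": "request-failed",
        "content": content,
        "endCallAfterSpokenEnabled": end_call,
    }


messages = [
    delayed_message("Still working on the transfer, thank you for your patience.", 3000),
    failed_message("I'm sorry, I wasn't able to complete the transfer."),
]
```

The resulting `messages` list can be placed directly under the `messages` key of a handoff tool.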

Rejection plan

Use rejectionPlan to prevent a handoff from executing based on conversation state. When all conditions in the plan match, the tool call is rejected and the rejection message is added to the conversation.

Regex condition

Match against message content using regular expressions:

{
  "tools": [
    {
      "type": "handoff",
      "rejectionPlan": {
        "conditions": [
          {
            "type": "regex",
            "regex": "(?i)\\b(cancel|stop|nevermind)\\b",
            "target": {
              "role": "user",
              "position": -1
            }
          }
        ]
      },
      "destinations": [
        {
          "type": "assistant",
          "assistantId": "your-assistant-id",
          "description": "transfer to billing"
        }
      ]
    }
  ]
}

This rejects the handoff if the user’s most recent message contains “cancel”, “stop”, or “nevermind” (case-insensitive).
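
You can try a pattern against sample transcripts before deploying it. The sketch below uses Python's `re` module, whose behavior may differ in details from the regex engine Vapi runs, and it assumes messages are dicts with `role` and `content` keys.

```python
import re
from typing import Optional

# Same pattern as the example above, written as a raw string (single backslashes).
PATTERN = re.compile(r"(?i)\b(cancel|stop|nevermind)\b")

Message = dict[str, str]


def last_user_message(messages: list[Message]) -> Optional[str]:
    """Return the most recent user message, mirroring role 'user' with position -1."""
    user_texts = [m["content"] for m in messages if m["role"] == "user"]
    return user_texts[-1] if user_texts else None


def would_reject(messages: list[Message]) -> bool:
    """True when the latest user message contains one of the trigger words."""
    text = last_user_message(messages)
    return text is not None and PATTERN.search(text) is not None
```

Because of the `\b` word boundaries, "Please STOP" matches but "unstoppable" does not, and the two-word phrase "never mind" is not covered by `nevermind`.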

Liquid condition

Use Liquid templates for more complex logic. The template must return exactly "true" or "false":

{
  "rejectionPlan": {
    "conditions": [
      {
        "type": "liquid",
        "liquid": "{% assign userMsgs = messages | where: 'role', 'user' %}{% if userMsgs.size < 3 %}true{% else %}false{% endif %}"
      }
    ]
  }
}

This rejects the handoff if fewer than 3 user messages exist in the conversation. Available Liquid variables include messages (array of recent messages), now (current timestamp), and any assistant variable values.

Group condition

Combine multiple conditions with AND or OR logic:

{
  "rejectionPlan": {
    "conditions": [
      {
        "type": "group",
        "operator": "OR",
        "conditions": [
          {
            "type": "regex",
            "regex": "(?i)\\b(cancel|stop)\\b",
            "target": { "role": "user" }
          },
          {
            "type": "liquid",
            "liquid": "{% assign userMsgs = messages | where: 'role', 'user' %}{% if userMsgs.size < 2 %}true{% else %}false{% endif %}"
          }
        ]
      }
    ]
  }
}

By default, all top-level conditions in the conditions array use AND logic — all must match for the rejection to trigger. Use a group condition with operator: "OR" to reject when any single condition matches.
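
The AND/OR combination logic can be modeled with a short recursive evaluator. This is a sketch of the semantics described above, not Vapi's code. Two things are assumptions: a regex target without `position` is treated as "any message with that role", and Liquid templates are not rendered here but delegated to a caller-supplied function, for which a stand-in that counts user messages is provided.

```python
import re
from typing import Any, Callable

Message = dict[str, str]
LiquidEval = Callable[[str, list[Message]], bool]


def condition_matches(cond: dict[str, Any], messages: list[Message], liquid_eval: LiquidEval) -> bool:
    kind = cond["type"]
    if kind == "regex":
        target = cond.get("target", {})
        texts = [m["content"] for m in messages
                 if "role" not in target or m["role"] == target["role"]]
        if "position" in target:
            # Assumption: position indexes the role-filtered list, -1 being the latest.
            try:
                texts = [texts[target["position"]]]
            except IndexError:
                texts = []
        return any(re.search(cond["regex"], text) for text in texts)
    if kind == "liquid":
        return liquid_eval(cond["liquid"], messages)
    if kind == "group":
        results = [condition_matches(c, messages, liquid_eval) for c in cond["conditions"]]
        return any(results) if cond["operator"] == "OR" else all(results)
    raise ValueError(f"unknown condition type: {kind!r}")


def should_reject(plan: dict[str, Any], messages: list[Message], liquid_eval: LiquidEval) -> bool:
    """Top-level conditions combine with AND; an empty plan never rejects."""
    conditions = plan.get("conditions", [])
    return bool(conditions) and all(condition_matches(c, messages, liquid_eval) for c in conditions)


def fewer_than_two_user_messages(_template: str, messages: list[Message]) -> bool:
    """Stand-in for the Liquid template in the group example; ignores the template text."""
    return sum(m["role"] == "user" for m in messages) < 2


PLAN = {
    "conditions": [
        {
            "type": "group",
            "operator": "OR",
            "conditions": [
                {"type": "regex", "regex": r"(?i)\b(cancel|stop)\b", "target": {"role": "user"}},
                {"type": "liquid", "liquid": "(template omitted in this sketch)"},
            ],
        }
    ]
}
```

With `PLAN`, a conversation containing a single user message is rejected by the second branch of the OR group, even when no trigger word appears.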

For the full schema, see the API reference.

Custom function definitions

Override the default function definition for more control. You can override the function name of each tool so that you can reference it in the system prompt, or pass custom parameters in a dynamic handoff request.

{
  "tools": [
    {
      "type": "handoff",
      "function": {
        "name": "handoff_to_department",
        "description": "Transfer the customer to the appropriate department based on their needs. Only use when explicitly requested or when the current assistant cannot help.",
        "parameters": {
          "type": "object",
          "properties": {
            "destination": {
              "type": "string",
              "description": "Department to transfer to",
              "enum": ["sales-team", "technical-support", "billing-department", "management"]
            },
            "reason": {
              "type": "string",
              "description": "Brief reason for the transfer"
            },
            "urgency": {
              "type": "boolean",
              "description": "Whether this is an urgent transfer"
            }
          },
          "required": ["destination", "reason"]
        }
      },
      "destinations": [
        {
          "type": "assistant",
          "assistantId": "sales-team",
          "description": "Sales inquiries and purchases"
        },
        {
          "type": "assistant",
          "assistantId": "technical-support",
          "description": "Technical issues and support"
        },
        {
          "type": "assistant",
          "assistantId": "billing-department",
          "description": "Billing and payment issues"
        },
        {
          "type": "assistant",
          "assistantId": "management",
          "description": "Escalations and complaints"
        }
      ]
    }
  ]
}
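
When you write the function definition by hand, the `destination` enum and the `destinations` array can drift apart. The sketch below is a hypothetical consistency check that reports enum values with no matching destination; it compares against `assistantId`, `assistantName`, and the literal `dynamic` for dynamic destinations.

```python
from typing import Any


def unmatched_enum_values(tool: dict[str, Any]) -> set[str]:
    """Return destination enum values that no entry in tool['destinations'] can satisfy."""
    properties = tool["function"]["parameters"]["properties"]
    enum_values = set(properties["destination"].get("enum", []))
    known: set[str] = set()
    for dest in tool["destinations"]:
        if dest.get("type") == "dynamic":
            known.add("dynamic")
        for key in ("assistantId", "assistantName"):
            if key in dest:
                known.add(dest[key])
    return enum_values - known


# Abbreviated version of the tool above, with one destination left out on purpose.
tool = {
    "type": "handoff",
    "function": {
        "name": "handoff_to_department",
        "parameters": {
            "type": "object",
            "properties": {
                "destination": {"type": "string", "enum": ["sales-team", "management"]},
            },
            "required": ["destination"],
        },
    },
    "destinations": [{"type": "assistant", "assistantId": "sales-team"}],
}
missing = unmatched_enum_values(tool)
```

Here `missing` contains `management`, which signals that the model could select a destination the tool cannot route to.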

Best practices

Clear descriptions: Write specific, actionable descriptions for each destination in your system prompt. Use tool.function.name to customize the name of the function to reference in your prompt.
Context management: Use lastNMessages or userAndAssistantMessages to limit context size for performance.
Model optimization: Use multiple tools for OpenAI, single tool for Anthropic.
Variable extraction: Extract key data before handoff to maintain context across assistants.
Tool messages: Add custom request-start messages to set caller expectations during transfers.
Testing: Test handoff scenarios thoroughly, including edge cases and rejection conditions.
Monitoring and analysis: Enable artifactPlan.fullMessageHistoryEnabled to capture the complete message history across all handoffs in your artifacts. See squad artifact behavior for details.

Troubleshooting

Ensure assistant IDs are valid and accessible
Verify webhook server URLs are reachable and return the proper format
Check that required parameters in custom functions match destinations
Monitor context size to avoid token limits
Test variable extraction schemas with sample data
Validate that assistant names exist in the same squad
Verify rejection plan conditions use correct regex syntax (remember to double-escape \\ in JSON)
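
One way to avoid escaping mistakes is to write the pattern once as a raw string and let a JSON serializer add the extra backslashes. The sketch below uses Python's standard `json` module to show the round trip.

```python
import json

# The pattern as the regex engine should see it: single backslashes.
pattern = r"(?i)\b(cancel|stop|nevermind)\b"

# json.dumps escapes each backslash, producing \\b in the JSON text.
encoded = json.dumps({"type": "regex", "regex": pattern})

# Parsing the JSON restores the original single-backslash pattern.
decoded = json.loads(encoded)["regex"]
```

If you hand-edit JSON instead, the value must contain `\\b`; a single `\b` in JSON is parsed as a backspace character and the word boundary is lost.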
