
# Server events

All messages sent to your Server URL are `POST` requests with this body shape:

```json
{
  "message": {
    "type": "<server-message-type>",
    "call": { /* Call Object */ },
    /* other fields depending on type */
  }
}
```

Common metadata included on most events:

* `phoneNumber`, `timestamp`
* `artifact` (recording, transcript, messages, etc.)
* `assistant`, `customer`, `call`, `chat`

Most events are informational and do not require a response. Responses are only expected for these types sent to your Server URL:

* `assistant-request`
* `tool-calls`
* `transfer-destination-request`
* `knowledge-base-request`

Note: Some specialized messages, such as `voice-request` and `call.endpointing.request`, are sent to their own dedicated servers if configured (e.g. `assistant.voice.server.url`, `assistant.startSpeakingPlan.smartEndpointingPlan.server.url`).
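
As a rough illustration of the body shape and the response rules above, the following sketch routes an incoming POST body by `message.type`. The function name `handle_server_message` and the `handlers` mapping are hypothetical, and returning an empty JSON object for informational events is an assumption, not documented behavior.

```python
import json

# Message types that, per the list above, expect a response body.
RESPONSE_TYPES = {
    "assistant-request",
    "tool-calls",
    "transfer-destination-request",
    "knowledge-base-request",
}


def handle_server_message(raw_body: bytes, handlers: dict) -> dict:
    """Parse a Server URL POST body and route it by message type.

    `handlers` (hypothetical) maps a message type to a function that
    takes the inner message dict.
    """
    message = json.loads(raw_body)["message"]
    msg_type = message.get("type")
    handler = handlers.get(msg_type)
    if handler is None:
        # Unhandled event: acknowledge with an empty object (assumption).
        return {}
    result = handler(message)
    # Only response-bearing types send the handler's return value back.
    return result if msg_type in RESPONSE_TYPES else {}
```

Whatever web framework serves the endpoint would serialize the returned dict as the JSON response body.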

### Function Calling (Tools)

Vapi supports OpenAI-style tool/function calling. Assistants can ping your server to perform actions.

Example assistant configuration (excerpt):

```json
{
  "model": {
    "provider": "openai",
    "model": "gpt-4o",
    "functions": [
      {
        "name": "sendEmail",
        "description": "Used to send an email to a client.",
        "parameters": {
          "type": "object",
          "properties": {
            "emailAddress": { "type": "string" },
            "message": { "type": "string" }
          },
          "required": ["emailAddress", "message"]
        }
      }
    ]
  }
}
```

When tools are triggered, your Server URL receives a `tool-calls` message:

```json
{
  "message": {
    "type": "tool-calls",
    "call": { /* Call Object */ },
    "toolWithToolCallList": [
      {
        "name": "sendEmail",
        "toolCall": { "id": "abc123", "parameters": { "emailAddress": "john@example.com", "message": "Hi!" } }
      }
    ],
    "toolCallList": [
      { "id": "abc123", "name": "sendEmail", "parameters": { "emailAddress": "john@example.com", "message": "Hi!" } }
    ]
  }
}
```

Respond with results for each tool call:

```json
{
  "results": [
    {
      "name": "sendEmail",
      "toolCallId": "abc123",
      "result": "{ \"status\": \"sent\" }"
    }
  ]
}
```

Optionally include a message to speak to the user while the tool runs or after it finishes.

If a tool does not need a response immediately, you can design it to be asynchronous.
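
The request and response shapes above can be connected with a small helper. This is a minimal sketch: `build_tool_results` and the `implementations` mapping are hypothetical names, and the error payload for an unknown tool is an assumption rather than a documented format.

```python
import json


def build_tool_results(message: dict, implementations: dict) -> dict:
    """Run each requested tool and collect results keyed by toolCallId."""
    results = []
    for call in message.get("toolCallList", []):
        name = call["name"]
        func = implementations.get(name)
        if func is None:
            # Assumed error shape; adjust to whatever your assistant expects.
            outcome = {"error": f"unknown tool: {name}"}
        else:
            outcome = func(**call.get("parameters", {}))
        results.append({
            "name": name,
            "toolCallId": call["id"],
            # The example response above carries the result as a JSON string.
            "result": json.dumps(outcome),
        })
    return {"results": results}
```

Each entry echoes the `id` from `toolCallList` as `toolCallId`, which ties the result back to the originating call.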

### Retrieving Assistants

For inbound phone calls, you can specify the assistant dynamically. If a PhoneNumber doesn't have an `assistantId`, Vapi may request one from your server:

```json
{
  "message": {
    "type": "assistant-request",
    "call": { /* Call Object */ }
  }
}
```

You must respond to the `assistant-request` webhook within **7.5 seconds end-to-end**. This limit is fixed and not configurable: the telephony provider enforces a 15-second cap, and Vapi reserves about 7.5 seconds for call setup. The timeout value shown elsewhere in the dashboard does not apply to this webhook.

To avoid timeouts:

* Return quickly with an existing `assistantId` or a minimal assistant, then enrich context asynchronously after the call starts using [Live Call Control](/calls/call-features).
* Host your webhook close to `us-west-2` to reduce latency, and target a response time under about 6 seconds to allow for network jitter.
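
One way to stay inside the time budget is to run the slow lookup with a deadline and fall back to a saved assistant when it overruns. The sketch below assumes a hypothetical `lookup` callable that returns a full response dict and a hypothetical fallback assistant ID; the 6-second default mirrors the target suggested above.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

# Shared worker pool so a slow lookup never blocks the webhook response.
_executor = ThreadPoolExecutor(max_workers=4)


def assistant_response_with_deadline(lookup, message, fallback_assistant_id,
                                     budget_seconds=6.0):
    """Return lookup(message), or a fallback assistantId if it is too slow."""
    future = _executor.submit(lookup, message)
    try:
        return future.result(timeout=budget_seconds)
    except FutureTimeout:
        # The lookup keeps running in the background; its result is dropped.
        return {"assistantId": fallback_assistant_id}
    except Exception:
        # A failed lookup should not cost the call either.
        return {"assistantId": fallback_assistant_id}
```

The budget covers only the lookup itself, so network transit to and from your server still has to fit in the remaining time.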

Respond with one of the following: an existing assistant ID, a transient assistant, or a transfer destination:

```json
{ "assistantId": "your-saved-assistant-id" }
```

```json
{
  "assistant": {
    "firstMessage": "Hey Ryan, how are you?",
    "model": {
      "provider": "openai",
      "model": "gpt-4o",
      "messages": [
        { "role": "system", "content": "You're Ryan's assistant..." }
      ]
    }
  }
}
```

```json
{ "destination": { "type": "number", "number": "+11234567890" } }
```

#### Transfer only (skip AI)

If you want to immediately transfer the call without using an assistant, return a `destination` in your `assistant-request` response. This bypasses AI handling.

```json
{
  "destination": {
    "type": "number",
    "number": "+14155552671",
    "callerId": "{{phoneNumber.number}}",
    "extension": "101",
    "message": "Connecting you to support."
  }
}
```

```json
{
  "destination": {
    "type": "sip",
    "sipUri": "sip:support@example.com",
    "sipHeaders": { "X-Account": "gold" },
    "message": "Transferring you now."
  }
}
```

* When `destination` is present in the `assistant-request` response, the call forwards immediately and `assistantId`, `assistant`, `squadId`, and `squad` are ignored.
* You must still respond within **7.5 seconds**.
* To transfer silently, set `destination.message` to an empty string.
* For caller ID behavior, see [Call features](/calls/call-features).

Or return an error message to be spoken to the caller:

```json
{ "error": "Sorry, not enough credits on your account, please refill." }
```

### Status Updates

```json
{
  "message": {
    "type": "status-update",
    "call": { /* Call Object */ },
    "status": "ended"
  }
}
```

* `scheduled`: Call scheduled.
* `queued`: Call queued.
* `ringing`: The call is ringing.
* `in-progress`: The call has started.
* `forwarding`: The call is about to be forwarded.
* `ended`: The call has ended.
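
A handler for these updates might simply keep the latest status per call. This sketch assumes the Call Object exposes an `id` field, which is not shown in the excerpt above; `record_status` and the in-memory `store` are hypothetical.

```python
def record_status(store: dict, message: dict) -> bool:
    """Save the latest status for a call; return True once the call has ended."""
    # Assumption: the Call Object carries an "id" field.
    call_id = message.get("call", {}).get("id", "unknown")
    status = message["status"]
    store[call_id] = status
    return status == "ended"
```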

### End of Call Report

```json
{
  "message": {
    "type": "end-of-call-report",
    "endedReason": "hangup",
    "call": { /* Call Object */ },
    "artifact": {
      "recording": { /* Recording object with URLs */ },
      "transcript": "AI: How can I help? User: What's the weather? ...",
      "messages": [
        { "role": "assistant", "message": "How can I help?" },
        { "role": "user", "message": "What's the weather?" }
      ]
    }
  }
}
```
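
As an illustration of consuming this report, the sketch below flattens `artifact.messages` into readable lines. The function name `summarize_report` and the shape of its return value are hypothetical; only the fields shown in the example above are read.

```python
def summarize_report(message: dict) -> dict:
    """Extract the ended reason and a line-per-turn view of the conversation."""
    artifact = message.get("artifact", {})
    lines = [
        f'{entry["role"]}: {entry["message"]}'
        for entry in artifact.get("messages", [])
        # Skip entries without spoken text (entry shapes may vary).
        if "role" in entry and "message" in entry
    ]
    return {
        "endedReason": message.get("endedReason"),
        "turns": len(lines),
        "lines": lines,
    }
```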

### Hang Notifications

```json
{
  "message": {
    "type": "hang",
    "call": { /* Call Object */ }
  }
}
```

Use this to surface delays or notify your team.

### Conversation Updates

Sent when an update is committed to the conversation history.

```json
{
  "message": {
    "type": "conversation-update",
    "messages": [ /* current conversation messages */ ],
    "messagesOpenAIFormatted": [ /* openai-formatted messages */ ]
  }
}
```

### Transcript

Partial and final transcripts from the transcriber.

```json
{
  "message": {
    "type": "transcript",
    "role": "user",
    "transcriptType": "partial",
    "transcript": "I'd like to book...",
    "isFiltered": false,
    "detectedThreats": [],
    "originalTranscript": "I'd like to book..."
  }
}
```

For final-only events, you may receive `type: "transcript[transcriptType=\"final\"]"`.
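
Because final transcripts can arrive under either type string, a handler that only cares about finalized text may want to accept both. A minimal sketch, with `final_transcript` as a hypothetical helper name:

```python
FINAL_ONLY_TYPE = 'transcript[transcriptType="final"]'


def final_transcript(message: dict):
    """Return the transcript text for final events, or None for partials."""
    msg_type = message.get("type")
    is_final = msg_type == FINAL_ONLY_TYPE or (
        msg_type == "transcript" and message.get("transcriptType") == "final"
    )
    return message.get("transcript") if is_final else None
```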

### Speech Update

```json
{
  "message": {
    "type": "speech-update",
    "status": "started",
    "role": "assistant",
    "turn": 2
  }
}
```

### Assistant Speech Started

Sent as the assistant begins speaking each segment of a turn, synchronized to audio playback. Designed for live captions, karaoke-style word highlighting, and any UI that needs to track what's being spoken in real time.

This event is **opt-in**. Add `"assistant.speechStarted"` to your assistant's `serverMessages` and/or `clientMessages` to receive it.

```json
{
  "message": {
    "type": "assistant.speechStarted",
    "text": "Hello world, how can I help you today?",
    "turn": 2,
    "source": "model",
    "timing": {
      /* optional — shape depends on voice provider, see below */
    }
  }
}
```

| Field    | Description                                                                                             |
| -------- | ------------------------------------------------------------------------------------------------------- |
| `text`   | Full assistant text for the current turn. **Not a delta** — accumulates across events in the same turn. |
| `turn`   | 0-indexed turn number. Multiple events within the same turn share the same `turn`.                      |
| `source` | `"model"` (LLM-generated), `"force-say"` (firstMessage / queued `say` actions), or `"custom-voice"`.    |
| `timing` | Optional. Present when the voice provider supports word-level timing. Shape depends on `timing.type`.   |

#### `timing.type: "word-alignment"` — ElevenLabs

```json
{
  "type": "word-alignment",
  "words": ["Hello", " ", "world"],
  "wordsStartTimesMs": [0, 320, 360],
  "wordsEndTimesMs": [310, 350, 720]
}
```

Per-word timestamps from ElevenLabs' alignment API. Events arrive at audio playback cadence (\~50–200ms apart). The `words[]` array includes space entries with real timing — join them and track a running character cursor to highlight `text` up to that position. No client-side interpolation needed.
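
The running character cursor described above could be tracked as follows. This sketch assumes each event's `words[]` contains only the words newly covered by that event, so their joined length can be added to the cursor; `advance_cursor` is a hypothetical helper.

```python
def advance_cursor(cursor: int, timing: dict) -> int:
    """Advance a character cursor by the joined length of this event's words."""
    if timing.get("type") != "word-alignment":
        return cursor
    # Space entries are included in words[], so a plain join is enough.
    return cursor + len("".join(timing.get("words", [])))


text = "Hello world, how can I help you today?"
timing = {
    "type": "word-alignment",
    "words": ["Hello", " ", "world"],
    "wordsStartTimesMs": [0, 320, 360],
    "wordsEndTimesMs": [310, 350, 720],
}
cursor = advance_cursor(0, timing)
highlighted = text[:cursor]  # "Hello world"
```

Reset the cursor to zero whenever `turn` changes, since `text` accumulates only within a turn.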

#### `timing.type: "word-progress"` — Minimax (with `voice.subtitleType: "word"`)

```json
{
  "type": "word-progress",
  "wordsSpoken": 22,
  "totalWords": 45,
  "segment": "the latest spoken segment text",
  "segmentDurationMs": 3200,
  "words": [
    { "word": "the", "startMs": 0, "endMs": 110 },
    { "word": "latest", "startMs": 110, "endMs": 480 }
  ]
}
```

Cursor-based per-segment progress.

Minimax only attaches subtitle data to the **final audio chunk of each synthesis segment**, so each `assistant.speechStarted` event for a Minimax turn fires near the *end* of that segment's audio playback — not at the start, and not per-word. The `wordsSpoken` value jumps in segment-sized increments, and the `words[]` array carries timestamps for the segment that just *finished*. Use it to retroactively animate that segment, or to extrapolate forward — but it cannot drive smooth real-time highlighting *during* the current segment. For true playback-cadence per-word events, use ElevenLabs.

`totalWords: 0` is a valid sentinel on the very first event of a turn before Minimax confirms its word count — guard against divide-by-zero when computing a progress fraction. See the [Minimax voice provider page](/providers/voice/minimax) for full configuration details.
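
The divide-by-zero guard mentioned above can be as small as this sketch (`progress_fraction` is a hypothetical helper name):

```python
def progress_fraction(timing: dict) -> float:
    """Fraction of the turn spoken so far, treating totalWords == 0 as no progress."""
    total = timing.get("totalWords", 0)
    if total <= 0:
        # Sentinel on the first event of a turn: word count not yet known.
        return 0.0
    return min(1.0, timing.get("wordsSpoken", 0) / total)
```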

#### No `timing` field — text-only fallback

All other providers (Cartesia, Deepgram, Azure, OpenAI, Inworld, etc.) emit text-only events with no `timing` object. One event per TTS chunk, gated to actual audio playback. Display `text` as a caption block, or interpolate a word cursor at a flat rate (\~3.5 words/sec) between events for an approximate cursor.
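
For the flat-rate approximation, a sketch along these lines estimates how many words of the current text have been spoken since the event arrived. The helper name and the whitespace-based word split are assumptions; the 3.5 words/sec default comes from the guidance above.

```python
def estimated_word_cursor(text: str, elapsed_seconds: float,
                          words_per_second: float = 3.5) -> int:
    """Approximate the number of words spoken after elapsed_seconds."""
    total_words = len(text.split())
    spoken = int(max(0.0, elapsed_seconds) * words_per_second)
    # Never run past the end of the text we actually have.
    return min(total_words, spoken)
```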

#### Behaviors to be aware of

* **`force-say` events always emit as text-only**, even on ElevenLabs and Minimax — there's no provider-level alignment for forced utterances (firstMessage, queued `say` actions).
* **On user barge-in, no further events fire for the interrupted turn.** Pair with the [`user-interrupted`](#user-interrupted) message and use the most recent `wordsSpoken` (or joined char cursor) to know what was actually spoken.
* **There is no companion `assistant.speechStopped` event.** Use [`speech-update`](#speech-update) (`status: "stopped"`) or watch `turn` increment to detect end-of-turn.
* **Custom voice timing depends on what your voice server returns.** If you return timestamped JSON frames from your custom voice server, those flow through as `timing.words[]`; raw PCM responses produce text-only events.

### Model Output

Tokens or tool-call outputs as the model generates. The optional `turnId` groups all tokens from the same LLM response, so you can correlate output with a specific turn.

```json
{
  "message": {
    "type": "model-output",
    "output": { /* token or tool call */ },
    "turnId": "abc-123"
  }
}
```

### Transfer Destination Request

Requested when the model wants to transfer but the destination is not yet known and must be provided by your server.

```json
{
  "message": {
    "type": "transfer-destination-request",
    "call": { /* Call Object */ }
  }
}
```

This event is emitted only if the assistant did not supply a destination when calling a `transferCall` tool (for example, it did not include a custom parameter like `phoneNumber`). If the assistant includes the destination directly, Vapi will transfer immediately and will not send this webhook.

Respond with a destination and optionally a message:

```json
{
  "destination": { "type": "number", "number": "+11234567890" },
  "message": { "type": "request-start", "message": "Transferring you now" }
}
```

### Transfer Update

Fires whenever a transfer occurs.

```json
{
  "message": {
    "type": "transfer-update",
    "destination": { /* assistant | number | sip */ }
  }
}
```

### User Interrupted

Sent when the user interrupts the assistant. The optional `turnId` identifies the LLM turn that was interrupted, matching the `turnId` on `model-output` messages so you can discard that turn's tokens.

```json
{
  "message": {
    "type": "user-interrupted",
    "turnId": "abc-123"
  }
}
```
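
To show how `turnId` can tie `model-output` and `user-interrupted` together, the sketch below buffers outputs per turn and drops a turn's buffer when it is interrupted. `TurnBuffer` is a hypothetical class, and messages without a `turnId` are simply ignored because the field is optional.

```python
from collections import defaultdict


class TurnBuffer:
    """Collect model outputs per turnId and discard interrupted turns."""

    def __init__(self):
        self._outputs = defaultdict(list)

    def handle(self, message: dict) -> None:
        turn_id = message.get("turnId")
        if turn_id is None:
            return  # turnId is optional; nothing to correlate
        if message.get("type") == "model-output":
            self._outputs[turn_id].append(message.get("output"))
        elif message.get("type") == "user-interrupted":
            self._outputs.pop(turn_id, None)

    def outputs(self, turn_id: str) -> list:
        return list(self._outputs.get(turn_id, []))
```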

### Language Change Detected

Sent when the transcriber switches based on detected language.

```json
{
  "message": {
    "type": "language-change-detected",
    "language": "es"
  }
}
```

### Phone Call Control (Advanced)

When requested in `assistant.serverMessages`, hangup and forwarding are delegated to your server.

```json
{
  "message": {
    "type": "phone-call-control",
    "request": "forward",
    "destination": { "type": "sip", "sipUri": "sip:agent@example.com" }
  }
}
```

```json
{
  "message": {
    "type": "phone-call-control",
    "request": "hang-up"
  }
}
```

### Knowledge Base Request (Custom)

Sent when `assistant.knowledgeBase.provider` is set to `"custom-knowledge-base"`.

```json
{
  "message": {
    "type": "knowledge-base-request",
    "messages": [ /* conversation so far */ ],
    "messagesOpenAIFormatted": [ /* openai-formatted messages */ ]
  }
}
```

Respond with documents (and optionally a custom message to speak):

```json
{
  "documents": [
    { "content": "Return policy is 30 days...", "similarity": 0.92, "uuid": "doc-1" }
  ]
}
```
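
The sketch below shows one way a server might assemble that response. It is illustrative only: the in-memory `corpus` and the word-overlap (Jaccard) score stand in for a real retrieval backend, and it assumes `messagesOpenAIFormatted` entries carry `role` and string `content` fields as in the OpenAI chat format.

```python
def knowledge_base_response(message: dict, corpus: dict, top_k: int = 3) -> dict:
    """Score documents against the latest user turn and return the best matches.

    `corpus` (hypothetical) maps a document uuid to its text content.
    """
    user_turns = [
        m["content"]
        for m in message.get("messagesOpenAIFormatted", [])
        if m.get("role") == "user" and isinstance(m.get("content"), str)
    ]
    query_terms = set(user_turns[-1].lower().split()) if user_turns else set()

    documents = []
    for uuid, content in corpus.items():
        doc_terms = set(content.lower().split())
        union = query_terms | doc_terms
        # Jaccard overlap as a stand-in for a real similarity score.
        similarity = len(query_terms & doc_terms) / len(union) if union else 0.0
        if similarity > 0:
            documents.append(
                {"content": content, "similarity": round(similarity, 2), "uuid": uuid}
            )
    documents.sort(key=lambda d: d["similarity"], reverse=True)
    return {"documents": documents[:top_k]}
```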

### Voice Input (Custom Voice Providers)

```json
{
  "message": {
    "type": "voice-input",
    "input": "Hello, world!"
  }
}
```

### Voice Request (Custom Voice Server)

Sent to `assistant.voice.server.url`. Respond with raw 1-channel 16-bit PCM audio at the requested sample rate (not JSON).

```json
{
  "message": {
    "type": "voice-request",
    "text": "Hello, world!",
    "sampleRate": 24000
  }
}
```
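
To make the expected response format concrete, this sketch produces 1-channel 16-bit PCM bytes at the requested sample rate. It generates a placeholder sine tone rather than speech, and it assumes little-endian sample order, which is not stated above; a real voice server would return synthesized audio for `text` instead.

```python
import math
import struct


def pcm_placeholder_tone(sample_rate: int, duration_seconds: float = 0.5,
                         frequency_hz: float = 440.0) -> bytes:
    """Return mono 16-bit PCM bytes containing a quiet sine tone."""
    frame_count = int(sample_rate * duration_seconds)
    samples = (
        int(0.3 * 32767 * math.sin(2 * math.pi * frequency_hz * i / sample_rate))
        for i in range(frame_count)
    )
    # "<h" = little-endian signed 16-bit (byte order is an assumption).
    return struct.pack(f"<{frame_count}h", *samples)
```

The bytes would be written directly as the HTTP response body, with no JSON wrapper.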

### Call Endpointing Request (Custom Endpointing Server)

Sent to `assistant.startSpeakingPlan.smartEndpointingPlan.server.url`.

```json
{
  "message": {
    "type": "call.endpointing.request",
    "messagesOpenAIFormatted": [ /* openai-formatted messages */ ]
  }
}
```

Respond with the timeout before considering the user's speech finished:

```json
{ "timeoutSeconds": 0.5 }
```
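
A custom endpointing server can apply any heuristic it likes. The sketch below uses a deliberately simple one: wait less when the last user message ends in sentence punctuation. The function name, the heuristic, and the two timeout values are illustrative assumptions, and it assumes `messagesOpenAIFormatted` entries carry `role` and string `content` fields.

```python
def endpointing_response(message: dict) -> dict:
    """Pick a shorter timeout when the last user turn looks complete."""
    user_turns = [
        m["content"]
        for m in message.get("messagesOpenAIFormatted", [])
        if m.get("role") == "user" and isinstance(m.get("content"), str)
    ]
    last = user_turns[-1].strip() if user_turns else ""
    looks_finished = last.endswith((".", "?", "!"))
    # Illustrative values only; tune against real conversations.
    return {"timeoutSeconds": 0.3 if looks_finished else 1.2}
```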

### Chat Events

* `chat.created`: Sent when a new chat is created.
* `chat.deleted`: Sent when a chat is deleted.

```json
{ "message": { "type": "chat.created", "chat": { /* Chat */ } } }
```

### Session Events

* `session.created`: Sent when a session is created.
* `session.updated`: Sent when a session is updated.
* `session.deleted`: Sent when a session is deleted.

```json
{ "message": { "type": "session.created", "session": { /* Session */ } } }
```