Understanding Call Concurrency

Plan, monitor, and scale simultaneous Vapi calls

Overview

Call concurrency is the number of Vapi calls that can be active at the same time. Each call occupies one slot, much like one line in a fixed pool of phone lines.

In this guide, you’ll learn to:

  • Understand the default concurrency allocation and when it is usually sufficient
  • Keep outbound and inbound workloads within plan limits
  • Increase reserved capacity directly from the Vapi Dashboard
  • Inspect concurrency data through API responses and analytics queries

What is concurrency?

Every Vapi account includes 10 concurrent call slots by default. When all slots are busy, new outbound dials or inbound connections wait until a slot becomes free.

  • Inbound agents: Rarely hit concurrency caps unless traffic surges (launches, seasonal spikes).
  • Outbound agents: More likely to reach limits when running large calling batches.

These limits ensure the underlying compute stays reliable for every customer. Higher concurrency requires reserving additional capacity, which Vapi provides through custom or add-on plans.

Managing concurrency

Outbound campaigns

Batch long lead lists into smaller chunks (for example, 50–100 numbers) and run those batches sequentially. This keeps your peak concurrent calls near the default limit while still working through large sets quickly.
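The batching step described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the Vapi SDK: the helper name `chunk_leads`, the batch size of 50, and the generated phone numbers are all hypothetical, and the code that actually places each call is left out.

```python
from typing import Iterator, Sequence


def chunk_leads(numbers: Sequence[str], batch_size: int = 50) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `numbers`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(numbers), batch_size):
        yield numbers[start:start + batch_size]


# Hypothetical lead list; real numbers would come from your CRM or a CSV export.
leads = [f"+1555000{i:04d}" for i in range(120)]
batches = list(chunk_leads(leads, batch_size=50))
print([len(batch) for batch in batches])  # [50, 50, 20]
```

Each batch would then be dialed to completion before the next one starts, so the number of calls in flight never exceeds the size of one batch.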

High-volume operations

If you regularly exceed 50,000 minutes per month, talk with Vapi about:

  • Custom plans that include higher baked-in concurrency
  • Add-on bundles that let you purchase extra call lines only when you need them

Use billing reports to pair minute usage with concurrency spikes so you can upgrade before calls are blocked.

Increase your concurrency limit

You can raise or reserve more call lines without contacting support:

  1. Open the Vapi Dashboard.
  2. Navigate to Settings → Billing.
  3. Find Reserved Concurrency (Call Lines).
  4. Increase the limit or purchase add-on concurrency lines.

Changes apply immediately, so you can scale ahead of known traffic surges.

View concurrency in call responses

When you create a call with POST /call, the response includes a subscriptionLimits object that shows the current state of your account.

Example request

curl 'https://api.vapi.ai/call' \
  -H 'authorization: Bearer {VAPI-PRIVATE-TOKEN}' \
  -H 'content-type: application/json' \
  --data-raw '{
    "assistantId": "4a170597-a0c2-4657-8c32-cb93f080cead",
    "customer": {"number": "+918936850777"},
    "phoneNumberId": "c6ea6cb0-0dfb-4a65-918f-6a33abb54b64"
  }'

Example response snippet

{
  "subscriptionLimits": {
    "concurrencyBlocked": false,
    "concurrencyLimit": 10,
    "remainingConcurrentCalls": 9
  },
  "id": "019a9046-121e-766d-bd1f-84f3ccc309c1",
  "status": "queued"
}

Field reference

  • concurrencyBlocked — true if the call could not start because all slots were full.
  • concurrencyLimit — Total concurrent call slots currently available to your org.
  • remainingConcurrentCalls — How many slots were open at the time you created the call.

Build monitoring around these values to alert when you approach the cap.
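One way to build that monitoring is to classify each POST /call response as it arrives. The sketch below assumes the response has already been parsed into a dictionary shaped like the example snippet above. The function name `concurrency_status`, the status labels, and the 80% warning threshold are illustrative choices, not Vapi conventions.

```python
from typing import Any, Mapping


def concurrency_status(call_response: Mapping[str, Any], warn_ratio: float = 0.8) -> str:
    """Classify a parsed POST /call response as "ok", "warning", "blocked", or "unknown"."""
    limits = call_response.get("subscriptionLimits")
    if not limits:
        return "unknown"  # the response carried no concurrency data
    if limits.get("concurrencyBlocked"):
        return "blocked"  # all slots were full when the call was created
    limit = limits.get("concurrencyLimit", 0)
    remaining = limits.get("remainingConcurrentCalls", 0)
    if limit <= 0:
        return "unknown"
    used_ratio = (limit - remaining) / limit
    return "warning" if used_ratio >= warn_ratio else "ok"


# The example response from this guide: 1 of 10 slots in use.
example = {
    "subscriptionLimits": {
        "concurrencyBlocked": False,
        "concurrencyLimit": 10,
        "remainingConcurrentCalls": 9,
    },
    "id": "019a9046-121e-766d-bd1f-84f3ccc309c1",
    "status": "queued",
}
print(concurrency_status(example))  # ok
```

A "warning" or "blocked" result could feed whatever alerting channel you already use. The threshold is worth tuning to how quickly your traffic ramps up.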

Track concurrency with the Analytics API

Use the /analytics endpoint to review historical concurrency usage and spot patterns that justify more capacity.

Example request

curl 'https://api.vapi.ai/analytics' \
  -H 'authorization: Bearer {VAPI-PRIVATE-TOKEN}' \
  -H 'content-type: application/json' \
  --data-raw '{
    "queries": [{
      "name": "Number of Concurrent Calls",
      "table": "subscription",
      "timeRange": {
        "start": "2025-10-16T18:30:00.000Z",
        "end": "2025-11-17T05:31:10.184Z",
        "step": "day"
      },
      "operations": [{
        "operation": "max",
        "column": "concurrency",
        "alias": "concurrency"
      }]
    }]
  }'
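If you issue this query from a script, the request body above can be generated rather than hand-edited. The following is a rough sketch: `build_concurrency_query` is a hypothetical helper, and the allowed steps are limited to the three this guide mentions (hour, day, week), though the API may accept others. It only builds the JSON body and does not send the request.

```python
import json
from typing import Any, Dict

# Steps named in this guide; treating this as the full set is an assumption.
ALLOWED_STEPS = ("hour", "day", "week")


def build_concurrency_query(start: str, end: str, step: str = "day") -> Dict[str, Any]:
    """Build the /analytics request body for maximum concurrency per time step."""
    if step not in ALLOWED_STEPS:
        raise ValueError(f"step must be one of {ALLOWED_STEPS}, got {step!r}")
    return {
        "queries": [{
            "name": "Number of Concurrent Calls",
            "table": "subscription",
            "timeRange": {"start": start, "end": end, "step": step},
            "operations": [{
                "operation": "max",
                "column": "concurrency",
                "alias": "concurrency",
            }],
        }]
    }


payload = build_concurrency_query(
    "2025-10-16T18:30:00.000Z", "2025-11-17T05:31:10.184Z", step="hour"
)
print(json.dumps(payload, indent=2))
```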

Example response

[{
  "name": "Number of Concurrent Calls",
  "timeRange": {
    "start": "2025-10-16T18:30:00.000Z",
    "end": "2025-11-17T05:31:10.184Z",
    "step": "day",
    "timezone": "UTC"
  },
  "result": [
    { "date": "2025-11-05T00:00:00.000Z", "concurrency": 0 },
    { "date": "2025-11-10T00:00:00.000Z", "concurrency": 1 },
    { "date": "2025-11-17T00:00:00.000Z", "concurrency": 1 }
  ]
}]

Adjust the timeRange.step to inspect usage by hour, day, or week. Peaks that align with campaign launches, seasonality, or support events highlight when you should reserve additional call lines.
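To turn an analytics response into a capacity signal, you can pick out the peak and compare it with your limit. The sketch below assumes the response has the shape of the example above: a list of query results, each with `result` rows keyed by the alias. The helper name `peak_concurrency` is hypothetical, and the limit of 10 is the default described in this guide, so substitute your own.

```python
from typing import Any, Mapping, Sequence, Tuple


def peak_concurrency(
    analytics: Sequence[Mapping[str, Any]], alias: str = "concurrency"
) -> Tuple[str, int]:
    """Return (date, value) of the highest `alias` value in the first query's result."""
    rows = analytics[0]["result"]
    if not rows:
        raise ValueError("analytics result is empty")
    top = max(rows, key=lambda row: row[alias])  # first row wins on ties
    return top["date"], top[alias]


# Trimmed copy of the example response from this guide.
example = [{
    "name": "Number of Concurrent Calls",
    "result": [
        {"date": "2025-11-05T00:00:00.000Z", "concurrency": 0},
        {"date": "2025-11-10T00:00:00.000Z", "concurrency": 1},
        {"date": "2025-11-17T00:00:00.000Z", "concurrency": 1},
    ],
}]

date, peak = peak_concurrency(example)
limit = 10  # assumed: the default limit; replace with your account's value
print(f"peak {peak} on {date} ({peak / limit:.0%} of limit)")
```

A peak that repeatedly approaches the limit is the pattern that suggests reserving more call lines.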

Next steps

  • Call queue management: Build a Twilio queue to buffer calls when you hit concurrency caps.
  • Outbound campaign planning: Design outbound strategies that pair batching with analytics.
  • Enterprise plans: Review larger plans that include higher default concurrency.